How to monitor YARN application memory per container?


Shmuel Blitz
(This question has also been posted on Stack Overflow.)

I am looking for a way to monitor memory usage of YARN containers over time.

Specifically: given a YARN application ID, how can you get a graph showing the memory usage of each of its containers over time?

The main goal is to better fit the memory allocations of our YARN applications (Spark / MapReduce), to avoid over-allocation and wasted cluster resources. A side goal is being able to debug memory issues when developing our jobs and picking reasonable resource allocations.

We've tried the Datadog integration, but it doesn't break down the metrics by container.

Another approach was to parse the hadoop-yarn logs. These logs have messages like:

Memory usage of ProcessTree 57251 for container-id container_e116_1495951495692_35134_01_000001: 1.9 GB of 11 GB physical memory used; 14.4 GB of 23.1 GB virtual memory used
Parsing the logs correctly can yield data that can be used to plot a graph of memory usage over time.

That's exactly what we want, but there are two downsides:

1. It involves reading human-readable log lines and parsing them into numeric data. We'd love to avoid that.
2. Even if this data can be consumed some other way, we hope that source carries additional information we might be interested in later. We wouldn't want to put time into parsing the logs just to realize we need something else.
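To illustrate the first downside, a minimal parsing sketch (Python; the regex and unit handling are our own assumptions based on the sample line above) would look like this:

import re

# Assumed format of the NodeManager "Memory usage of ProcessTree" lines.
LINE_RE = re.compile(
    r"Memory usage of ProcessTree \d+ for container-id "
    r"(?P<container>container_\S+): "
    r"(?P<pmem>[\d.]+) (?P<pmem_unit>[KMG]B) of [\d.]+ [KMG]B physical memory used; "
    r"(?P<vmem>[\d.]+) (?P<vmem_unit>[KMG]B) of [\d.]+ [KMG]B virtual memory used"
)

TO_MB = {"KB": 1.0 / 1024, "MB": 1.0, "GB": 1024.0}

def parse_line(line):
    """Return (container_id, physical_mb, virtual_mb), or None if no match."""
    m = LINE_RE.search(line)
    if m is None:
        return None
    pmem_mb = float(m.group("pmem")) * TO_MB[m.group("pmem_unit")]
    vmem_mb = float(m.group("vmem")) * TO_MB[m.group("vmem_unit")]
    return m.group("container"), pmem_mb, vmem_mb

It handles the sample line above, but it's exactly the kind of brittle scraping we'd rather not maintain.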
Is there any other way to extract these metrics, either by plugging into an existing producer or by writing a simple listener?

Perhaps a whole other approach?


Re: How to monitor YARN application memory per container?

Sidharth Kumar
Hi,

I guess you can get it from http://<resourcemanager-host>:<rm-port>/jmx or /metrics 
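For example, something like this (Python sketch; host and port are placeholders, and the bean attribute names may vary by version) should dump memory-related beans from the /jmx servlet, which returns JSON:

import requests

# Placeholder RM address; stock Hadoop serves {"beans": [...]} at /jmx.
resp = requests.get("http://resourcemanager-host:8088/jmx", timeout=10)
for bean in resp.json().get("beans", []):
    name = bean.get("name", "")
    if "QueueMetrics" in name:
        # Attribute names are assumptions; inspect the raw JSON first.
        print(name, "AllocatedMB =", bean.get("AllocatedMB"),
              "AvailableMB =", bean.get("AvailableMB"))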

Regards


Re: How to monitor YARN application memory per container?

Shmuel Blitz
Hi,

Thanks for your response.

The /metrics API returns a blank page on our RM.

The /jmx API has some metrics, but they are the same ones we are already loading into Datadog.
That isn't good enough, because it doesn't break down memory use by container.

I need the by-container breakdown because resource allocation is per container, and I would like to see whether my job is really using all of its allocated memory.

Shmuel


Re: How to monitor YARN application memory per container?

Sunil G
Hi Shmuel

In the Hadoop 2.8 release line, you can use the "yarn node -status {nodeId}" CLI command or the "http://<rm http address:port>/ws/v1/cluster/nodes/{nodeid}" REST endpoint to get containers' actual resource usage per node. The same is available in any of the Hadoop 3.0 alpha releases.
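Polling that endpoint could look roughly like this (untested Python sketch; the RM address, node id, and utilization field names are from memory and may differ by version):

import requests

# Placeholder RM address and node id.
url = "http://rm-host:8088/ws/v1/cluster/nodes/node-host:45454"
node = requests.get(url, timeout=10).json()["node"]
# In 2.8+ the node report carries a resourceUtilization section with
# node-wide and aggregated-container figures (field names may vary).
util = node.get("resourceUtilization", {})
print("containers pmem MB:", util.get("aggregatedContainersPhysicalMemoryMB"))
print("node pmem MB:", util.get("nodePhysicalMemoryMB"))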

Thanks
Sunil


Re: How to monitor YARN application memory per container?

Shmuel Blitz
Hi Sunil,

Thanks for your response.

Here is the output I get when running "yarn node -status {nodeId}":

Node Report :
        Node-Id : myNode:4545
        Rack : /default
        Node-State : RUNNING
        Node-Http-Address : myNode:8042
        Last-Health-Update : Wed 14/Jun/17 08:25:43:261EST
        Health-Report :
        Containers : 7
        Memory-Used : 44032MB
        Memory-Capacity : 49152MB
        CPU-Used : 16 vcores
        CPU-Capacity : 48 vcores
        Node-Labels :

However, this is information for the entire node, aggregated across all of its containers.

I have no way of using this to see whether the value I give to 'spark.executor.memory' makes sense.

I'm looking for used/allocated memory information per container.

Shmuel 


Re: How to monitor YARN application memory per container?

Sunil G
Hi Shmuel

This feature is available in the Hadoop 2.8+ release lines, or in the Hadoop 3 alphas.

Thanks
Sunil


Re: How to monitor YARN application memory per container?

Sunil G
Adding to that: what is recorded is aggregated container usage per node. I don't think you'll get real per-container memory usage recorded from YARN.
In ideal cases you'll see these two entries:

Resource Utilization by Node : 
Resource Utilization by Containers : PMem:0 MB, VMem:0 MB, VCores:0.0

Thanks
Sunil


Re: How to monitor YARN application memory per container?

Naganarasimha Garla-2
Container resource usage has been put into the ATS v2 metrics system. But if you do not want the heavy ATS v2 subsystem, I am not sure any of the current interfaces expose the actual resource usage of a container in a way that solves your problem.
Probably this could be extended in ContainerManagementProtocol.getContainerStatuses, so that at least the AM can be aware of actual container resource usage.
Thoughts?
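Coming back to ATS v2: reading container entities out of the timeline reader would look roughly like this (sketch only; the reader address, entity type, and metric ids are from memory):

import requests

# Placeholder timeline reader address and application id.
url = ("http://timeline-reader:8188/ws/v2/timeline/apps/"
       "application_1495951495692_35134/entities/YARN_CONTAINER")
entities = requests.get(url, params={"fields": "METRICS"}, timeout=10).json()
for entity in entities:
    for metric in entity.get("metrics", []):
        # Per-container time series (e.g. memory), keyed by timestamp.
        print(entity.get("id"), metric.get("id"), metric.get("values"))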


Re: How to monitor YARN application memory per container?

Shmuel Blitz
Hi,

Thanks for your response.

We are using CDH, and our version doesn't support the solutions above.
ATS is also not an option for us right now.

We have decided to turn on JMX for all our jobs (Spark / Hadoop MapReduce) and use jmap to collect the data and send it to Datadog.
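Roughly, the collector will be a sketch like the following (the jmap output parsing and metric names are placeholders we haven't finalized; jmap -heap is the JDK 8 tool, and the datadog package provides a DogStatsD client):

import re
import subprocess

from datadog import initialize, statsd  # pip install datadog

initialize(statsd_host="127.0.0.1", statsd_port=8125)

def report_heap(pid, container_id):
    # Run `jmap -heap` against the executor JVM and grab the first
    # "used = <bytes>" figure; a rough placeholder, not a real parser.
    out = subprocess.run(["jmap", "-heap", str(pid)],
                         capture_output=True, text=True, check=True).stdout
    m = re.search(r"used\s*=\s*(\d+)", out)
    if m:
        statsd.gauge("yarn.container.heap_used_bytes", int(m.group(1)),
                     tags=["container_id:" + container_id])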

Shmuel




Re: How to monitor YARN application memory per container?

Jasson Chenwei
Hi,

Please take a look at Timeline Server v2, which supports aggregating NodeManager-side info into HBase.
This includes both node-level info (e.g., node memory and CPU usage) and container-level info (e.g., container memory and CPU usage). I am currently setting it up and can confirm that container-related info is stored in HBase.


Wei Chen


Re: How to monitor YARN application memory per container?

Miklos Szegedi
Hello,

MAPREDUCE-6829 added counters that show the peak memory usage of a MapReduce job.
Here are some of the new counters:

[root@42e243b8cf16 hadoop]# bin/yarn jar ./share/hadoop/mapreduce/hadoop-mapreduce-examples-....jar pi 1 1000
Number of Maps  = 1
Samples per Map = 1000
...
Peak Map Physical memory (bytes)=274792448
Peak Map Virtual memory (bytes)=2112589824
Peak Reduce Physical memory (bytes)=167776256
Peak Reduce Virtual memory (bytes)=2117087232
...
Estimated value of Pi is 3.14800000000000000000
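The same counters can be read programmatically from the JobHistory server's REST API, roughly like this (sketch; the HistoryServer address, job id, and internal counter names are from memory):

import requests

# Placeholder HistoryServer address and job id.
url = ("http://history-host:19888/ws/v1/history/mapreduce/jobs/"
       "job_1495951495692_35134/counters")
data = requests.get(url, timeout=10).json()
for group in data["jobCounters"]["counterGroup"]:
    for counter in group.get("counter", []):
        # Peak-memory counters should show up here; the exact enum names
        # may differ by version, so filter loosely.
        if "MEMORY" in counter.get("name", ""):
            print(counter["name"], counter.get("totalCounterValue"))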

Thanks,

Miklos
