YARN SLS : OutOfMemoryError: unable to create new native thread yarn


YARN SLS : OutOfMemoryError: unable to create new native thread yarn

赵思晨(思霖)
Hi,
I am running 200+ jobs, each containing 100 tasks, when I start SLS with slsrun.sh.
It fails with this error:

2018-07-24 04:47:27,957 INFO capacity.CapacityScheduler: Added node 11.178.150.104:1604 clusterResource: <memory:821760000, vCores:15408000, disk: 6099000000M, resource2: 8025G>
Exception in thread "main" java.lang.OutOfMemoryError: unable to create new native thread
        at java.lang.Thread.start0(Native Method)
        at java.lang.Thread.start(Thread.java:717)
        at java.util.concurrent.ThreadPoolExecutor.addWorker(ThreadPoolExecutor.java:957)
        at java.util.concurrent.ThreadPoolExecutor.prestartAllCoreThreads(ThreadPoolExecutor.java:1617)
        at org.apache.hadoop.yarn.sls.scheduler.TaskRunner.start(TaskRunner.java:157)
        at org.apache.hadoop.yarn.sls.SLSRunner.start(SLSRunner.java:247)
        at org.apache.hadoop.yarn.sls.SLSRunner.run(SLSRunner.java:950)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
        at org.apache.hadoop.yarn.sls.SLSRunner.main(SLSRunner.java:957)

I set -Xmx20480m and -Xms20480m in hadoop-env.sh, but it still doesn't work.

Can anyone help me?

Thanks in advance,

Sichen

Re: YARN SLS : OutOfMemoryError: unable to create new native thread yarn

Clay B.
Hi Sichen,

I would expect you are running out of mmap ranges on most stock Linux
kernels. (Each thread takes a mmap slot.) You can increase your
vm.max_map_count[1] to see if that helps.

-Clay

[1]: A discussion on effecting the change:
https://www.systutorials.com/241561/maximum-number-of-mmaped-ranges-and-how-to-set-it-on-linux/
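For reference, the current limit can be inspected from /proc and raised via sysctl. This is only a sketch: 262144 is an illustrative value, not a recommendation from this thread, and the write requires root.

```shell
# Check the current mmap range limit (each Java thread consumes at
# least one mapping for its stack).
cat /proc/sys/vm/max_map_count

# Raise it for the running kernel (requires root); 262144 is only an
# example value:
# sudo sysctl -w vm.max_map_count=262144

# To persist across reboots, add this line to /etc/sysctl.conf:
# vm.max_map_count=262144
```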


---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Re: YARN SLS : OutOfMemoryError: unable to create new native thread yarn

赵思晨(思霖)
In reply to this post by 赵思晨(思霖)
Hi Clay, all,
Thank you for your response and help.
I have already solved it.
I found that there are several possible causes for OutOfMemoryError: unable to create new native thread.
Running out of mmap ranges may be one of them, but it was not the problem I hit (I run SLS in Docker).

The key to my issue is in sls-runner.xml:
  <property>
    <name>yarn.sls.runner.pool.size</name>
    <value>100000</value>
  </property>
The value I set was too large.

In TaskRunner.java, line 155: https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-sls/src/main/java/org/apache/hadoop/yarn/sls/scheduler/TaskRunner.java#L155

executor = new ThreadPoolExecutor(threadPoolSize, threadPoolSize, 0,
      TimeUnit.MILLISECONDS, queue);

The ThreadPoolExecutor constructor is:
ThreadPoolExecutor(int corePoolSize, int maximumPoolSize, long keepAliveTime, TimeUnit unit, BlockingQueue<Runnable> workQueue)

Both corePoolSize and maximumPoolSize are set to the same value, threadPoolSize, and all core threads are started eagerly. So if a user sets yarn.sls.runner.pool.size too large, creating corePoolSize native threads triggers the OutOfMemoryError.
And in fact, for convenience and as insurance, users will often set a large value.
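TaskRunner.start() calls prestartAllCoreThreads() on the executor, so every core thread is created up front, before any task runs. A small standalone sketch (not SLS code; the pool size here is arbitrary) shows the eager start:

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class PrestartDemo {
    public static void main(String[] args) {
        int threadPoolSize = 8; // stands in for yarn.sls.runner.pool.size
        ThreadPoolExecutor executor = new ThreadPoolExecutor(
                threadPoolSize, threadPoolSize, 0,
                TimeUnit.MILLISECONDS, new LinkedBlockingQueue<Runnable>());
        // This is what TaskRunner.start() does: all corePoolSize native
        // threads are created immediately. With corePoolSize = 100000,
        // this is the point where thread creation fails.
        int started = executor.prestartAllCoreThreads();
        System.out.println("threads started eagerly: " + started);
        executor.shutdown();
    }
}
```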

Here is my solution:

1. I think we should add a max pool size to SLSConfiguration.java and pass it as the second parameter:
executor = new ThreadPoolExecutor(threadPoolSize, max_PoolSize, 0,
      TimeUnit.MILLISECONDS, queue);

2. We could also validate the input threadPoolSize: if it is larger than max_PoolSize, throw an error or log a warning to remind users.
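Point 2 might look roughly like this (a hypothetical sketch: MAX_POOL_SIZE and the clamp-and-warn behavior are my assumptions, not existing SLS code):

```java
public class PoolSizeCheck {
    // Hypothetical upper bound -- not a real SLS configuration key.
    static final int MAX_POOL_SIZE = 1000;

    // Clamp an over-large yarn.sls.runner.pool.size and warn, instead
    // of letting prestartAllCoreThreads() die with OutOfMemoryError.
    static int validatePoolSize(int requested) {
        if (requested > MAX_POOL_SIZE) {
            System.err.println("yarn.sls.runner.pool.size=" + requested
                    + " exceeds " + MAX_POOL_SIZE + "; clamping");
            return MAX_POOL_SIZE;
        }
        return requested;
    }

    public static void main(String[] args) {
        System.out.println(validatePoolSize(100000));
        System.out.println(validatePoolSize(50));
    }
}
```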

I can submit a patch for my solution. What do you think?


