separate JVM flags for map and reduce tasks

Vasilis Liaskovitis
Hi,

I'd like to pass different JVM options for map tasks and different
ones for reduce tasks. I think it should be straightforward to add
mapred.mapchild.java.opts, mapred.reducechild.java.opts to my
conf/mapred-site.xml and process the new options accordingly in
src/mapred/org/apache/mapreduce/TaskRunner.java . Let me know if you
think it's more involved than what I described.

My question is: if mapred.job.reuse.jvm.num.tasks is set to -1 (always
reuse), can the same JVM be re-used for different types of tasks? So
the same JVM being used e.g. first by a map task and then used by
reduce task. I am assuming this is definitely possible, though I
haven't verified in the code.
So, if one wants to pass different JVM options to map tasks and
reduce tasks, perhaps mapred.job.reuse.jvm.num.tasks should be set to 1
(never reuse)?

thanks for your help,

- Vasilis

Re: separate JVM flags for map and reduce tasks

Hemanth Yamijala
Vasilis,

> I'd like to pass different JVM options for map tasks and different
> ones for reduce tasks. I think it should be straightforward to add
> mapred.mapchild.java.opts, mapred.reducechild.java.opts to my
> conf/mapred-site.xml and process the new options accordingly in
> src/mapred/org/apache/mapreduce/TaskRunner.java . Let me know if you
> think it's more involved than what I described.

In trunk (I haven't checked earlier versions), there are already
options such as mapreduce.map.java.opts and
mapreduce.reduce.java.opts. Strangely, these are not documented in
mapred-default.xml, even though mapred.child.java.opts is
deprecated in favor of these two options. Please refer to
MAPREDUCE-478 for details.
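
To make the idea concrete, here is a minimal sketch of how per-type
options could be resolved. It is not the actual TaskRunner code: a
plain Map stands in for the job configuration, the class and method
names (ChildOptsSketch, javaOptsFor) are made up for illustration, and
the fallback from the per-type option to the deprecated
mapred.child.java.opts is an assumption about the lookup order.

```java
import java.util.Map;

// Illustrative sketch only; a plain Map stands in for the Hadoop job configuration.
class ChildOptsSketch {
    enum TaskType { MAP, REDUCE }

    static final String MAP_OPTS = "mapreduce.map.java.opts";
    static final String REDUCE_OPTS = "mapreduce.reduce.java.opts";
    static final String CHILD_OPTS = "mapred.child.java.opts"; // deprecated shared option

    // Assumed lookup order: per-type option, then the deprecated shared
    // option, then the caller-supplied default.
    static String javaOptsFor(TaskType type, Map<String, String> conf, String defaultOpts) {
        String perTypeKey = (type == TaskType.MAP) ? MAP_OPTS : REDUCE_OPTS;
        String shared = conf.getOrDefault(CHILD_OPTS, defaultOpts);
        return conf.getOrDefault(perTypeKey, shared);
    }
}
```

With this shape, a job that sets only mapreduce.map.java.opts gets
those flags for its map tasks, while its reduce tasks pick up whatever
the shared option or the default provides.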

>
> My question is: if mapred.job.reuse.jvm.num.tasks is set to -1 (always
> reuse), can the same JVM be re-used for different types of tasks? So
> the same JVM being used e.g. first by a map task and then used by
> reduce task. I am assuming this is definitely possible, though I
> haven't verified in the code.

Nope. JVMs are not reused across task types. o.a.h.mapred.JvmManager
has the relevant code. It has a JvmManagerForType inner class, kept
per task type, to which all reuse-related calls are delegated. In
particular, launchJVM, the basic method that either triggers a reuse
or spawns a new JVM, operates based on the task type.
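
Roughly, the per-type bookkeeping can be pictured like the sketch
below. This is a simplified model, not the JvmManager source: the
class PerTypeJvmSketch, its Jvm stand-in, and the method names are
hypothetical, and the reuse limit follows the convention from the
question (-1 for unlimited reuse, 1 for no reuse).

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.EnumMap;
import java.util.Map;

// Simplified model: one pool of idle JVMs per task type, never shared across types.
class PerTypeJvmSketch {
    enum TaskType { MAP, REDUCE }

    // Hypothetical stand-in for a child JVM process.
    static final class Jvm {
        final int id;
        final TaskType type;
        int tasksRun = 0;
        Jvm(int id, TaskType type) { this.id = id; this.type = type; }
    }

    private final Map<TaskType, Deque<Jvm>> idle = new EnumMap<>(TaskType.class);
    private final int reuseLimit; // -1 means no limit on tasks per JVM
    private int nextId = 0;

    PerTypeJvmSketch(int reuseLimit) {
        this.reuseLimit = reuseLimit;
        for (TaskType t : TaskType.values()) {
            idle.put(t, new ArrayDeque<>());
        }
    }

    // Reuse an idle JVM of the same task type if one exists, else spawn a new one.
    Jvm launch(TaskType type) {
        Jvm jvm = idle.get(type).poll();
        if (jvm == null) {
            jvm = new Jvm(nextId++, type);
        }
        jvm.tasksRun++;
        return jvm;
    }

    // Return the JVM to its own type's pool only if it may still run more tasks.
    void taskFinished(Jvm jvm) {
        if (reuseLimit == -1 || jvm.tasksRun < reuseLimit) {
            idle.get(jvm.type).add(jvm);
        }
    }
}
```

Because each pool is keyed by task type, a JVM started for a map task
can only ever be handed to another map task, so the flags it was
started with stay appropriate for everything it runs.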

> So, if one wants to pass different JVM options to map tasks and
> reduce tasks, perhaps mapred.job.reuse.jvm.num.tasks should be set to 1
> (never reuse)?
>

Given the above, this is not necessary. You can reuse JVMs and pass
separate parameters to the respective task types.