Hadoop Streaming (with Python) and Queue's

eric.brose
Hey all,
We just added queues to our capacity scheduler (we did not define a default queue, which it appears we might have to change),
and now if I try to run a simple streaming job I get the following error:
10/07/14 11:03:02 ERROR streaming.StreamJob: Error Launching job : java.io.IOException: Queue "default" does not exist
        at org.apache.hadoop.mapred.JobTracker.submitJob(JobTracker.java:2998)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:508)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:959)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:955)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:953)

Streaming Job Failed!

I've been playing around with adding my queue name (with the generic -D option) to the streaming command, but have had no luck.
E.g.:
bin/hadoop jar contrib/streaming/hadoop-0.20.2-streaming.jar -file /dev/mapper.py -mapper /dev/mapper.py -file /dev/reducer.py -reducer /dev/reducer.py -input DEV/input/* -output DEV/output/ -D mapred.queue.names="dev"

With this I get the following error:

10/07/14 10:54:49 ERROR streaming.StreamJob: Unrecognized option: -D


I've also tried something similar to one of the examples in the streaming documentation:

bin/hadoop jar contrib/streaming/hadoop-0.20.2-streaming.jar -file /dev/mapper.py -mapper /dev/mapper.py -file /dev/reducer.py -reducer /dev/reducer.py -input DEV/input/* -output DEV/output/ -D mapred.reduce.tasks=2

and I still get the error:
ERROR streaming.StreamJob: Unrecognized option: -D

Any assistance would be greatly appreciated! Thanks ahead of time!
-eric
P.S. I'm using version 0.20.2 on RHEL servers.

Re: Hadoop Streaming (with Python) and Queue's

Moritz Krog
I second that observation. I copied and pasted most of the -D options directly from the
tutorial and got the same error message.

I'm sorry I can't help you, Eric.


Re: Hadoop Streaming (with Python) and Queue's

Ted Yu-3
If you're using the capacity scheduler, see:
http://hadoop.apache.org/common/docs/r0.20.2/capacity_scheduler.html#Setting+up+queues

The queues can be checked in the JobTracker web UI, under the Scheduling
Information section.


Re: Hadoop Streaming (with Python) and Queue's

Amareshwari Sri Ramadasu
In reply to this post by Moritz Krog
-D is a generic option, so any -D options must be placed before the command-specific (streaming) options.
The syntax is:
bin/hadoop jar streaming.jar <generic options> <command options>

Thanks
Amareshwari
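
Applied to the command from the original post, the reordering might look like the sketch below. It only assembles and prints the command so the option order can be checked first. Note one assumption that goes beyond the reply above: the job is pointed at a queue with mapred.job.queue.name rather than mapred.queue.names (the latter is, as far as I know, the cluster-side list of queues, not a per-job setting). The queue name "dev" and the paths are taken from the original post.

```shell
# Generic options (-D key=value) go right after the jar name.
# Assumption: mapred.job.queue.name picks the queue for this job;
# "dev" is the queue name used in the original post.
GENERIC_OPTS="-D mapred.job.queue.name=dev -D mapred.reduce.tasks=2"

# Streaming-specific options follow the generic ones.
STREAMING_OPTS="-file /dev/mapper.py -mapper /dev/mapper.py -file /dev/reducer.py -reducer /dev/reducer.py -input DEV/input/* -output DEV/output/"

CMD="bin/hadoop jar contrib/streaming/hadoop-0.20.2-streaming.jar $GENERIC_OPTS $STREAMING_OPTS"

# Print the assembled command for review before running it by hand.
echo "$CMD"
```

If the printed command looks right, it can be pasted into a shell on a machine with the Hadoop client installed.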
