Can I have multiple reducers?

Can I have multiple reducers?

Forhadoop

Hello,

In my application I need to further reduce the keys output by the original
reducer.

I was reading about ChainReducer and ChainMapper, but it looks like they are for:
one or more mappers -> reducer -> 0 or more mappers

I need something like:
one or more mappers -> reducer -> reducer

Please help me figure out the best way to achieve this. Currently, the only
option seems to be to write another MapReduce application and run it
separately after the first one. In this second application, the mapper will
be a dummy and won't do anything. The reducer will further combine the
outputs of the first run.

Any other comments, such as that this is not good programming practice, are
welcome, so that I know if I am heading in the wrong direction.
--
View this message in context: http://www.nabble.com/Can-I-have-multiple-reducers--tp26018722p26018722.html
Sent from the Hadoop core-user mailing list archive at Nabble.com.

Re: Can I have multiple reducers?

Aaron Kimball
If you need another shuffle after your first reduce pass, then you need a
second MapReduce job to run after the first one. Just use an IdentityMapper.

This is a reasonably common situation.
- Aaron
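
To make the two-job idea concrete, here is a minimal in-memory sketch in Python. It is a toy model of map, shuffle, and reduce, not Hadoop code, and every name in it (run_job, visit_mapper, dedupe_reducer, count_reducer, the "country,city" input format) is a hypothetical chosen for illustration. The first reducer changes the key, so its output has to be regrouped by a second job whose mapper passes records through unchanged.

```python
from collections import defaultdict


def run_job(records, mapper, reducer):
    """Toy model of one MapReduce job: map, shuffle (group by key), reduce."""
    groups = defaultdict(list)
    for key, value in records:
        for out_key, out_value in mapper(key, value):
            groups[out_key].append(out_value)  # the "shuffle": group by key
    output = []
    for key in sorted(groups):
        output.extend(reducer(key, groups[key]))
    return output


def identity_mapper(key, value):
    """Pass each record through unchanged, like an identity mapper."""
    yield key, value


def visit_mapper(_offset, line):
    """Job 1 map (hypothetical): key each 'country,city' line by the pair."""
    country, city = line.split(",")
    yield (country, city), 1


def dedupe_reducer(key, _values):
    """Job 1 reduce: emit one record per distinct (country, city), keyed by country."""
    country, _city = key
    yield country, 1  # new key, so records for one country are still scattered


def count_reducer(country, values):
    """Job 2 reduce: count the distinct cities seen for each country."""
    yield country, sum(values)


def distinct_cities_per_country(lines):
    first_output = run_job(enumerate(lines), visit_mapper, dedupe_reducer)
    # Second job: identity map, then a second shuffle groups by country.
    return run_job(first_output, identity_mapper, count_reducer)
```

In a real cluster the second job would simply read the first job's output directory as its input; the sketch only shows why the extra shuffle between the two reducers is needed.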

In reply to Forhadoop's post of Thu, Oct 22, 2009 at 4:17 PM.

Re: Can I have multiple reducers?

Amandeep Khurana
If you haven't already done so, you can also explore using combiners.
I'm not sure that will solve your problem, though, since a combiner runs on
the map side and all your k,v pairs for a given key k won't get aggregated
in one place...
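
The limitation mentioned above can be seen in a small Python sketch. This is a toy model rather than Hadoop code; combine, reduce_all, and the two input splits are hypothetical names and data made up for illustration. Each map task pre-aggregates only its own records, so the same key still shows up in several partial outputs and a reduce step is needed to bring them together.

```python
from collections import defaultdict


def combine(split):
    """Pre-aggregate one map task's (key, value) pairs locally, as a combiner might."""
    partial = defaultdict(int)
    for key, value in split:
        partial[key] += value
    return sorted(partial.items())


def reduce_all(partials):
    """Merge the combined output of every map task, as the reduce phase does after the shuffle."""
    totals = defaultdict(int)
    for partial in partials:
        for key, value in partial:
            totals[key] += value
    return sorted(totals.items())


# Two hypothetical input splits, each processed by a different map task.
split_a = [("x", 1), ("y", 1), ("x", 1)]
split_b = [("x", 1), ("y", 1)]

# Key "x" appears in both partial results: no single combiner saw all of its values.
partials = [combine(split_a), combine(split_b)]
totals = reduce_all(partials)
```

A combiner can cut the volume of data sent to the reducers, but it is an optimization: Hadoop may run it zero, one, or several times, so it cannot stand in for a second reduce pass.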

In reply to Aaron Kimball's post of 10/22/09.


--


Amandeep Khurana
Computer Science Graduate Student
University of California, Santa Cruz

Re: Can I have multiple reducers?

Amogh Vasekar
In reply to this post by Aaron Kimball
Hi,
On what parameters does the output key of your (first) reducer depend?

Amogh

In reply to Aaron Kimball's post of 10/23/09 8:24 AM.