How To Pass Parameters To Mapper Through Main Method

How To Pass Parameters To Mapper Through Main Method

Boyu Zhang
Dear All,

I am implementing a clustering algorithm in which I need to compare each
line to two specific lines (they all have the same format) and output two
scores denoting the similarity between each line and the two specific lines.

Can I define two global variables (the two specific lines) in the main()
method and pass those two variables to the mapper class?
Or can I store the two lines in a separate file (say, Centric) and have the
mapper class read that file and compare each line from the other files (say,
Data, which holds the data to be processed) with the two lines from Centric?

Thanks a lot for reading my email; I really appreciate any help!

Boyu Zhang(Emma)
University of Delaware

Re: How To Pass Parameters To Mapper Through Main Method

Amogh Vasekar
Hi,
There are several options here. You can use JobConf (0.18) / context.getConfiguration() (0.20) to pass these lines to all tasks (assuming they aren't too large) and retrieve them in configure() / setup(). Or use the DistributedCache to read a file containing these lines (possibly with JVM reuse, if you want that extra bit as well).
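
For the second option (reading the lines from a file), a minimal sketch of the file-reading part might look like the following. It uses only the JDK so it runs without Hadoop; the class name CentricFile and the method readReferenceLines are made up for this illustration. In a real job the file would presumably be registered in the driver with DistributedCache.addCacheFile(uri, conf), and the task would find its local copy via DistributedCache.getLocalCacheFiles(conf) before reading it once in configure() / setup().

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper: loads the two reference lines from a small side file. */
public class CentricFile {

    /** Returns the first two non-empty lines of the file, or fails if there are fewer. */
    public static String[] readReferenceLines(Path file) throws IOException {
        List<String> lines = Files.readAllLines(file, StandardCharsets.UTF_8);
        String[] refs = new String[2];
        int found = 0;
        for (String line : lines) {
            if (line.trim().isEmpty()) {
                continue; // skip blank lines
            }
            refs[found++] = line;
            if (found == 2) {
                return refs;
            }
        }
        throw new IOException("Expected two reference lines in " + file + ", found " + found);
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the local copy of the "Centric" file that a task would see.
        Path centric = Files.createTempFile("centric", ".txt");
        Files.write(centric, Arrays.asList("1.0,2.0", "", "3.0,4.0"), StandardCharsets.UTF_8);

        String[] refs = readReferenceLines(centric);
        System.out.println(refs[0]);
        System.out.println(refs[1]);

        Files.delete(centric);
    }
}
```

Reading the file once per task, rather than once per input line, keeps the per-record work down to the comparison itself.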

Thanks,
Amogh

On 10/26/09 6:17 AM, "Boyu Zhang" <[hidden email]> wrote:



Re: How To Pass Parameters To Mapper Through Main Method

Boyu Zhang
Dear Amogh,

Thank you for the tip. I tried it with JobConf and configure(), and it
worked! Thanks a lot!
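
For later readers, here is a rough sketch of the pattern. To keep it runnable without Hadoop on the classpath, java.util.Properties stands in for JobConf. The key names, the LineMapper class, and the prefix-based score are placeholders invented for this example, not the actual code from the job. In the real 0.18 API the driver would call set(key, value) on the JobConf, and the mapper would override configure(JobConf job) and read the values back with job.get(key).

```java
import java.util.Properties;

/** Stand-in sketch: Properties plays the role of Hadoop's JobConf. */
public class ConfigPassingSketch {

    // Hypothetical key names, chosen for this example only.
    public static final String KEY_FIRST = "centric.line.first";
    public static final String KEY_SECOND = "centric.line.second";

    /** Mimics a mapper: configure() runs once per task, map() once per input line. */
    public static class LineMapper {
        private String first;
        private String second;

        /** Reads the two reference lines back out of the configuration. */
        public void configure(Properties conf) {
            first = conf.getProperty(KEY_FIRST);
            second = conf.getProperty(KEY_SECOND);
            if (first == null || second == null) {
                throw new IllegalStateException("Both reference lines must be set by the driver");
            }
        }

        /** Returns one score per reference line for the given input line. */
        public int[] map(String line) {
            return new int[] { commonPrefix(line, first), commonPrefix(line, second) };
        }
    }

    /** Placeholder similarity (an assumption): length of the shared leading prefix. */
    public static int commonPrefix(String a, String b) {
        int n = Math.min(a.length(), b.length());
        int i = 0;
        while (i < n && a.charAt(i) == b.charAt(i)) {
            i++;
        }
        return i;
    }

    public static void main(String[] args) {
        // "Driver" side: store the two reference lines as plain strings.
        Properties conf = new Properties();
        conf.setProperty(KEY_FIRST, "1,2,3");
        conf.setProperty(KEY_SECOND, "1,5,9");

        // "Task" side: configure once, then score each input line.
        LineMapper mapper = new LineMapper();
        mapper.configure(conf);
        int[] scores = mapper.map("1,2,9");
        System.out.println(scores[0] + " " + scores[1]); // prints "4 2"
    }
}
```

Because the values travel as strings inside the job configuration, this fits small parameters like two reference lines; larger data is better shipped as a file.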

Boyu

On Mon, Oct 26, 2009 at 12:09 AM, Amogh Vasekar <[hidden email]> wrote:
