Best way to Merge small XML files


shujamughal
Hi Folks,

Hundreds of small XML files arrive each hour, ranging in size from 5 MB
to 15 MB. Since Hadoop does not work well with small files, I want to
merge them. What is the best way to merge these XML files?



--
Regards
Shuja-ur-Rehman Baig
<http://pk.linkedin.com/in/shujamughal>

Re: Best way to Merge small XML files

madhu phatak
Hi
You can write an InputFormat that creates input splits spanning multiple
files. That should solve your problem.
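
As a rough illustration of the grouping step such an InputFormat would need, here is a minimal sketch in plain Java with no Hadoop dependency. The class name SplitPacker, the method pack, and the size cap are hypothetical names chosen for this example, not Hadoop APIs. Hadoop also ships an abstract CombineFileInputFormat built around the same idea, which may be a better starting point than writing everything from scratch.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Hadoop: decides which files share a split.
class SplitPacker {

    /**
     * Groups file sizes (in bytes) into batches, in input order, so that each
     * batch stays at or below maxBytes. A single file larger than maxBytes
     * ends up alone in its own batch.
     */
    static List<List<Long>> pack(List<Long> fileSizes, long maxBytes) {
        List<List<Long>> splits = new ArrayList<>();
        List<Long> current = new ArrayList<>();
        long currentBytes = 0;

        for (long size : fileSizes) {
            // Close the current batch if adding this file would exceed the cap.
            if (!current.isEmpty() && currentBytes + size > maxBytes) {
                splits.add(current);
                current = new ArrayList<>();
                currentBytes = 0;
            }
            current.add(size);
            currentBytes += size;
        }

        // Flush the last, possibly partial, batch.
        if (!current.isEmpty()) {
            splits.add(current);
        }
        return splits;
    }
}
```

For example, files of 5, 15, 10, 12 and 7 MB with a 32 MB cap come out as two batches: the first three files together, then the last two. In a real InputFormat each batch would become one input split, and the record reader would still have to parse each XML file in the batch separately.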


Re: Best way to Merge small XML files

Kai Voigt
Did you look into Hadoop Archives?

http://hadoop.apache.org/mapreduce/docs/r0.21.0/hadoop_archives.html

Kai


--
Kai Voigt
[hidden email]