Re: editing etc hosts files of a cluster


Allen Wittenauer
Any time you deal with pushing files around, you also have to deal with the
repercussions of when the file fails to get to its destination or it fails
to get there in a timely manner. [Hai hadoop config files.] If you use an
interface alias/vip/multi-a/whatever to deal with namenode availability,
then the host information becomes even more critical.

Rather than build something custom, I chose to use well-known, off-the-shelf
software to keep host information relatively in sync: BIND.


On 10/19/09 8:09 PM, "David B. Ritch" <[hidden email]> wrote:

> Most of the communication and name lookups within a cluster refer to
> other nodes within that same cluster.  It is usually not a big deal to
> put all the systems from a cluster in a single hosts file, and rsync it
> around the cluster.  (Consider using prsync, which comes with pssh,
> http://www.theether.org/pssh/, or your favorite cluster management
> software.)
> Editing each individually clearly doesn't scale; but editing it once and
> replicating it does.
>
> Is a large hosts file less efficient than nscd or a caching DNS server
> for nodes within the cluster?
>
> Thanks,
>
> David
>
> On 10/19/2009 8:02 PM, Edward Capriolo wrote:
>> On Mon, Oct 19, 2009 at 7:17 PM, Allen Wittenauer
>> <[hidden email]> wrote:
>>  
>>>
>>>
>>> On 10/19/09 11:46 AM, "Edward Capriolo" <[hidden email]> wrote:
>>>
>>>    
>>>> I am interested in your post. What has caused you to run caching DNS
>>>> servers on each of your nodes? Is this a hadoop specific problem or a
>>>> problem  specific to your implementation?
>>>>      
>>> Hadoop does a -tremendous- number of hostname lookups.  If you don't have
>>> either nscd or a local DNS caching server, you are likely throwing away
>>> what could be some significant performance gains.
>>>
>>>    
>>>> My assumption here is that a hadoop cluster of say 1000 nodes would
>>>> repeatedly talk to the same 1000 nodes.
>>>>      
>>> ... and that's the catch!  Every node running the DFSClient code or being
>>> called out from a map/reduce task is a potential hostname that would need
>>> to be resolved.  Just think about something like distcp.
>>>
>>> Also note that this is before we talk about monitoring, any other naming
>>> services, CNAMEs, multi-As, etc, that get built as a normal part of running
>>> an infrastructure.
>>>
>>>    
>>>> Are you saying that nscd is
>>>> inadequate to handle the size of the cache, or that nscd is not very
>>>> efficient? What exactly is the reason you are running a caching DNS
>>>> server on each node?
>>>>      
>>> In the case of Yahoo!, we had (or, at least, had a perception that we had)
>>> or were going to have jobs that did a lot of direct DNS lookups and/or
>>> accessed/referenced things outside of the local grid.  Also note that a DNS
>>> caching server is going to store more information about hostnames than a
>>> simple host to IP service like nscd.
>>>
>>> Hypothetical:  Let's say I'm building rules for a spam filter and part of my
>>> process is to look up the MX record for a given host.  nscd isn't going to
>>> help you there.
>>>
>>> In the case of LinkedIn, the jury is still out.  I suspect we don't have
>>> nscd.conf tuned correctly.  Our grid isn't that big, our connections in/out
>>> are fairly small, etc. It has been one of the things on my todo list since I
>>> got hired here 2 months ago. :)
>>>
>>> [For the record, I'm not one of those crazy people who turns off nscd
>>> because I had a bad experience with a  broken version five years ago.  In
>>> the case of Yahoo!, I was the crazy person who started insisting we turn it
>>> on, albeit not for hosts.]
>>>
>>>
>>>    
>> Cool thanks for the info.
>>
>> I have found NSCD to be absolutely essential in most/all situations.
>> Whenever I would truss processes on OSes without NSCD (say FreeBSD
>> 6.2) I would see numerous repeated 'stat' against /etc/passwd and
>> /etc/group.
>>
>> If you are doing users and groups through LDAP, nscd is super important
>> as well. You're not going to want to make a series of lookups for each stat.
>>
>> I would think the most efficient implementation would be nscd and a
>> local caching server in that case. NSCD should be very efficient since
>> it is done through libraries, whereas DNS lookups have to open sockets
>> (overhead). However, I can see your point that nscd cannot handle other
>> types of records.
>>
>>  
>


Re: editing etc hosts files of a cluster

David B. Ritch
I also prefer to avoid custom software, and follow standards.  We use Puppet
to manage our node configuration (including hadoop config files), and adding
one more file to the configuration is trivial.

I prefer not to run additional daemons on all my nodes when I can avoid it.
Replicating our hosts file allows us to avoid running named on all the
nodes.

David
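
As an illustration of the "edit it once and replicate it" approach, here is a minimal Python sketch that renders hosts-file text from a single inventory. It is not taken from anyone's setup in this thread: the function name `render_hosts`, the inventory layout, and the example hostnames and addresses are all assumptions made up for the example. How the result is pushed out (Puppet, prsync, or something else) is left to the tool you already use.

```python
import ipaddress


def render_hosts(inventory, header="# generated file, do not edit by hand"):
    """Render hosts-file text from a {hostname: (ip, [aliases])} mapping.

    The inventory layout is hypothetical; adapt it to whatever your
    configuration management tool stores.
    """

    def sort_key(item):
        # ip_address() raises ValueError on a malformed address, so a bad
        # inventory entry fails here instead of reaching the nodes.
        addr = ipaddress.ip_address(item[1][0])
        return (addr.version, int(addr))

    lines = [header, "127.0.0.1 localhost"]
    # Sorting by numeric address keeps the output stable, so it diffs
    # cleanly when the generated file is kept under source control.
    for name, (ip, aliases) in sorted(inventory.items(), key=sort_key):
        lines.append(" ".join([ip, name, *aliases]))
    return "\n".join(lines) + "\n"


# Example inventory with made-up names and addresses.
INVENTORY = {
    "node002.example.internal": ("10.0.0.12", []),
    "node001.example.internal": ("10.0.0.11", ["namenode"]),
}

if __name__ == "__main__":
    print(render_hosts(INVENTORY), end="")
```

Generating the file from one source means every node is supposed to receive byte-identical content, which makes a failed or late push easy to spot with a simple checksum comparison.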


Re: editing etc hosts files of a cluster

Allen Wittenauer

Everything can be made to work on a small scale.  As the grid grows,
well...




Re: editing etc hosts files of a cluster

Steve Loughran
In reply to this post by Allen Wittenauer
Allen Wittenauer wrote:
> A bit more specific:
>
> At Yahoo!, we had either every server as a DNS slave or a DNS caching
> server.  
>
> In the case of LinkedIn, we're running Solaris so nscd is significantly
> better than its Linux counterpart.  However, we still seem to be blowing out
> the cache too much.  So we'll likely switch to DNS caching servers here as
> well.

The standard Hadoop scripts don't tune DNS caching in the JVM, so Hadoop
doesn't notice DNS entries changing. That adds extra complexity to the
DNS-lookup-failure class of bugs: the situation where the TaskTracker (TT)
and forked jobs see different IP addresses for the same hosts.
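
To make the staleness problem concrete, here is a toy Python model of a lookup cache with a time-to-live. It is only a sketch of the general idea, not the JVM's implementation (the JVM's own cache lifetime is governed by the `networkaddress.cache.ttl` security property). The class name `TtlResolverCache`, the injectable `resolve` and `clock` parameters, and the addresses are assumptions for illustration.

```python
import time


class TtlResolverCache:
    """Toy hostname cache: answers are reused until their TTL expires."""

    def __init__(self, resolve, ttl_seconds, clock=time.monotonic):
        self._resolve = resolve    # callable: hostname -> address
        self._ttl = ttl_seconds
        self._clock = clock        # injectable so the example needs no sleeping
        self._entries = {}         # hostname -> (address, expiry time)

    def lookup(self, hostname):
        now = self._clock()
        cached = self._entries.get(hostname)
        if cached is not None and now < cached[1]:
            # Still fresh by the cache's own reckoning, even if the
            # underlying record has changed in the meantime.
            return cached[0]
        address = self._resolve(hostname)
        self._entries[hostname] = (address, now + self._ttl)
        return address


if __name__ == "__main__":
    records = {"namenode": "10.0.0.11"}   # stand-in for the real name service
    fake_now = [0.0]
    cache = TtlResolverCache(records.__getitem__, ttl_seconds=30,
                             clock=lambda: fake_now[0])

    print(cache.lookup("namenode"))       # first lookup goes to the "server"
    records["namenode"] = "10.0.0.99"     # the record changes
    fake_now[0] = 10.0
    print(cache.lookup("namenode"))       # still inside the TTL: old answer
    fake_now[0] = 30.0
    print(cache.lookup("namenode"))       # TTL expired: the change is seen
```

Two processes that filled their caches at different times can hold different answers for the same name until the older entry expires, and a process that never expires entries keeps the old answer until it restarts.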

Re: editing etc hosts files of a cluster

Steve Loughran
In reply to this post by Allen Wittenauer
David B. Ritch wrote:

> Most of the communication and name lookups within a cluster refer to
> other nodes within that same cluster.  It is usually not a big deal to
> put all the systems from a cluster in a single hosts file, and rsync it
> around the cluster.  (Consider using prsync, which comes with pssh,
> http://www.theether.org/pssh/, or your favorite cluster management
> software.)
> Editing each individually clearly doesn't scale; but editing it once and
> replicating it does.
>
> Is a large hosts file less efficient than nscd or a caching DNS server
> for nodes within the cluster?
>

Pro
  * removes the DNS server as a SPOF
  * works on clusters without DNS servers (virtual ones, for example)
  * lets you set up private hostnames ("namenode", "jobtracker") that
    don't change
  * lets you keep the cluster config under SCM

Con
  * harder to push out changes
  * weird errors when your cluster is inconsistent


We could do a lot in Hadoop in detecting and reporting DNS problems;
contributions here would be very welcome. They are a dog to test though.
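
As one possible way to catch the "inconsistent cluster" case outside of Hadoop, the following Python sketch compares hosts-file contents collected from several nodes and reports names that map to different addresses, or that are missing on some nodes. It is an illustration under assumptions, not an existing Hadoop feature: the function names `parse_hosts` and `find_conflicts` and the sample data are invented, and collecting the file contents from each node (for example with pssh) is left out.

```python
from collections import defaultdict


def parse_hosts(text):
    """Return {name: ip} for every hostname and alias in hosts-file text."""
    mapping = {}
    for line in text.splitlines():
        fields = line.split("#", 1)[0].split()   # drop comments, split on blanks
        if len(fields) < 2:
            continue                             # blank or comment-only line
        ip, names = fields[0], fields[1:]
        for name in names:
            mapping.setdefault(name, ip)         # keep the first entry seen
    return mapping


def find_conflicts(hosts_by_node):
    """Return {name: {ip_or_None: [nodes]}} for names that differ across nodes.

    The key None lists nodes whose file has no entry for that name.
    """
    seen = defaultdict(lambda: defaultdict(list))
    for node, text in hosts_by_node.items():
        for name, ip in parse_hosts(text).items():
            seen[name][ip].append(node)

    all_nodes = set(hosts_by_node)
    conflicts = {}
    for name, by_ip in seen.items():
        covered = {node for nodes in by_ip.values() for node in nodes}
        missing = sorted(all_nodes - covered)
        if len(by_ip) > 1 or missing:
            report = {ip: sorted(nodes) for ip, nodes in by_ip.items()}
            if missing:
                report[None] = missing
            conflicts[name] = report
    return conflicts


if __name__ == "__main__":
    # Made-up file contents standing in for what was fetched from each node.
    sample = {
        "node001": "10.0.0.11 namenode\n10.0.0.12 jobtracker\n",
        "node002": "10.0.0.11 namenode\n10.0.0.13 jobtracker\n",
    }
    for name, report in find_conflicts(sample).items():
        print(name, report)
```

Running a check like this after each push turns a silent inconsistency into an explicit report before jobs start failing in confusing ways.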