Hi Garth,

Thanks for taking the time to reply!

On Wed, Mar 6, 2013 at 6:39 PM, Garthy D
<garthy_lmkltybr / entropicsoftware.com> wrote:
>
> Hi Timur,
>
> If you're working with DNS you're going to run into all sorts of caching fun
> in general, so it's a good idea to be prepared for it. Applications (e.g.
> browsers), operating systems, ISPs, and nameservers all do their own caching
> of DNS results. Did you know, for example, that the data could be over a
> week out-of-date because you are using an ISP that disregards TTLs, and your
> browser has been caching previous results to remain responsive?

Yes, I do realize that propagating updates can take a significant
amount of time. However, eventually, they should be propagated -- the
behavior you're describing sounds like a bug.

> The general rationale is that it is expensive to always have the latest
> information available- in terms of amount of data, or just the time taken to
> parse /etc/hosts, or just the initial wait in making the request. Resolver
> libraries are usually called very frequently with exactly the same data. To
> keep libraries responsive, usually the first request of a sort is made, and
> the resolver waits for a response. The next similar requests use cached
> results, rather than waiting for a response each time.

Agreed. However, to stay correct, the cache should eventually be
invalidated when there is an update.

> I can't speak for the Resolv class myself- I've never used it- but the
> rationale is probably similar. If you've got a hard requirement such as
> checking /etc/hosts for changes, you might want to consider a wrapper or
> proxy class that watches /etc/hosts for changes in its timestamp, and then
> does whatever you need- restarting OS-level resolvers, perhaps dropping and
> recreating the Resolv object, or somehow forcing it to drop its cache.
> Exactly what needs to be done will depend on the problem you are trying to
> solve.

As I mentioned, one workaround is to call Socket::getaddrinfo, since
glibc itself does not cache getaddrinfo results (a caching daemon such
as nscd may, if one is running).
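
For reference, here is roughly what that workaround looks like. This is
only a sketch: system_addresses is a name I made up for illustration,
and it uses Addrinfo.getaddrinfo from the socket library, which goes
through the system resolver instead of Resolv.

```ruby
require 'socket'

# Hypothetical helper (not part of any library): resolve a host name
# through the system resolver (getaddrinfo) instead of Resolv.
# Raises SocketError if the name cannot be resolved.
def system_addresses(host)
  # Restrict to one socket type so each address is not returned once
  # per socket type, then drop any remaining duplicates.
  Addrinfo.getaddrinfo(host, nil, nil, :STREAM).map(&:ip_address).uniq
end
```

Calling system_addresses("localhost") in a long-running process should
reflect whatever the system resolver currently sees for that name.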

However, I was trying to point out the larger issue here: the Resolv
class (and the Dnsruby gem), which expose these operations through
class methods, never pick up changes after /etc/hosts is updated. This
is not a delay in propagation but actually incorrect behavior --
looking through the source, I didn't see anything that would cause the
cache to be invalidated or updated. That in itself looks like a bug,
so I wanted to find out whether this was a conscious decision and
whether there are any plans to address it. Maybe "talk" is not the
best venue for it?
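
To make the behavior concrete, here is a small sketch that should
reproduce it without touching the real /etc/hosts. It points
Resolv::Hosts at a temporary file; the host name, the addresses, and
the stale_lookup_demo helper are all made up for illustration, and I am
assuming the hosts file is read once on first lookup, which is what the
source I looked at does.

```ruby
require 'resolv'
require 'tempfile'

# Hypothetical demo helper: returns the lookups made before and after the
# hosts file changes, plus a lookup from a freshly created resolver.
def stale_lookup_demo
  Tempfile.create('hosts') do |f|
    f.write("10.0.0.1 example.test\n")
    f.flush

    hosts  = Resolv::Hosts.new(f.path)
    before = hosts.getaddresses('example.test')

    # Rewrite the file while the resolver object is still alive.
    File.write(f.path, "10.0.0.2 example.test\n")

    after = hosts.getaddresses('example.test')                       # same object
    fresh = Resolv::Hosts.new(f.path).getaddresses('example.test')   # new object
    [before, after, fresh]
  end
end
```

If my reading of the source is right, the second lookup still returns
the old address and only the fresh object sees the new one.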

You are correct to point out that one can work around it by
reloading the class on every query, but that seems like overkill?
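
A cheaper middle ground, along the lines of the timestamp-watching
wrapper you suggested, might look something like the sketch below.
ReloadingHosts is a hypothetical name, not an existing class, and it
only covers hosts-file lookups via getaddresses; it recreates the
underlying Resolv::Hosts object when the file's mtime changes, so it
would miss an edit that leaves the mtime untouched.

```ruby
require 'resolv'

# Hypothetical wrapper (not part of Resolv): rebuilds the underlying
# Resolv::Hosts object whenever the hosts file's mtime changes.
class ReloadingHosts
  def initialize(path = Resolv::Hosts::DefaultFileName)
    @path  = path
    @mutex = Mutex.new
    @mtime = nil
    @hosts = nil
  end

  # Returns an array of address strings (empty if the name is unknown).
  def getaddresses(name)
    current.getaddresses(name)
  end

  private

  # Returns a Resolv::Hosts object that matches the file's current mtime.
  def current
    @mutex.synchronize do
      mtime = File.mtime(@path)
      if @hosts.nil? || mtime != @mtime
        @hosts = Resolv::Hosts.new(@path)
        @mtime = mtime
      end
      @hosts
    end
  end
end
```

This costs one stat call per query instead of re-parsing the file each
time, which seems like a reasonable trade-off for a long-running
process.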

> Hope this helps. :)
>
> Cheers,
> Garth
>
>
> On 07/03/13 09:11, Timur Alperovich wrote:
>>
>> Hey guys,
>>
>> I ran into a subtle bug recently where I did not realize that the
>> Resolv class actually caches the contents of the /etc/hosts file. This
>> is problematic, as if the resolution is done in a long-running
>> process, any changes are not visible to the process unless the class
>> is reloaded or the process is restarted. I was wondering if anyone
>> knows what the rationale is for caching the contents and why there are
>> no checks to see if the file has been modified? For now, I've worked
>> around this by calling Socket::getaddrinfo, as getaddrinfo does the
>> right thing.

-- 
Cheers,
Timur