On Thu, Jul 5, 2012 at 4:13 AM, Eliezer Croitoru <eliezer / ngtech.co.il> wrote:
> thanks in advance i need a bit help to break the ice that my head is in.
>
> i am working on ICAP server and i am building some acls for content
> filtering.

Ah, interesting!

> i have a list of domains and i want to apply acl such as ".example.com" will
> match all domains that starts with "example.com".
> but the problem is that domain start in a reverse order.
> i was thinking on what is the best\better way to match a domain to domain
> list ?

If in memory (which does not seem to be the case here) I would create
a forest for matching with TLD's as root level:

com
+- example
+- google

> i have a sql(PGSQL\MYSQL) db that contains the list of domains.
> so i was thinking to use "divide and Conquer way" to dissemble the requested
> domain such as "subporndomain.example.com" to
> ["com","example","subporndomain"] and then to fetch from the db all the
> domains that starts with "example.com" and then check match each and one of
> them(if any record exists) to the requested domain.

That won't work since if you have a domain "foo.example.com" using
"example.com" as prefix won't retrieve it.

Since you have a relational database, I would store domains in reverse
order ("example.com" -> "com.example"), create an index on that column
and use LIKE operator, for example:

-- just a matching test
SELECT MIN(rev_domain)
WHERE rev_domain = 'com.example'
OR rev_domain LIKE 'com.example.%'

SELECT rev_domain
WHERE rev_domain = 'com.example'
OR rev_domain LIKE 'com.example.%'
LIMIT 1

-- all
SELECT domain
WHERE rev_domain = 'com.example'
OR rev_domain LIKE 'com.example.%'


You might be able to write a procedure to convert regular
representation to reverse representation, make that a calculated
column using the procedure and create an index on that calculated
column.  Alternatively calculate the value via a trigger.  Then your
insertion and update logic can stay as is.

> i was thinking of using the "string".start_with?
> method in reverse such as
> "moc.elpmaxe.niamodnropbus".start_with?(["moc.elpmaxe", ...])
> with the list of domains as a prefix..
> but there is one problem: i cant match all the domains as prefix because
> some of them are not a prefix but an exact match.
> so i will need to do some code like this:
>
> dom = "subdomain.example.com"
> domacl = ".example.com"
> domacl1 = "example.com"
>
> def match?(domacl,dom)
>         if domacl.start_with?(".")
>            return  ( dom.reverse.start_with?(domacl.reverse.chop) )
>         else
>            return true if dom.eql?(domacl)
>            return false
>         end
> end

You can make that simpler and I guess also a tad more efficient:

def match? domacl, dom
  dom == domacl ||
    domacl[0] == '.' && dom.reverse.start_with?(domacl.reverse.chop)
end

Note: this is not exactly identical from a logical point of view - I
just assume that you do not have a dom which starts with ".".

> so it works..

:-)

> my goal is maximum efficiency.

Yes, but you should not loose sight of functionality.

Kind regards

robert


-- 
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/