Child Safe Domain Analyzer

This came up when we had to process millions of domains and decide, for each one, whether it is child safe or not. With millions of domains, checking every piece of content, every tag and everything else was quite infeasible. So we made a basic assumption: checking parameters like page titles, tags and meta tags can, to some extent, tell us whether a site is child safe. Then another problem came up: if the domain isn’t in English, how can we determine whether it is in fact an unsafe, pornographic site? So we decided to use Python to scrape a domain, grab its titles and tags, and translate them using Microsoft’s Translator API, since it provides 2 million characters a month compared to Google’s 1 million characters for $10.

We didn’t know Microsoft’s translation was so accurate, and it is practically free, since 2 million characters a month is a good offer for developers like us.
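
The title-and-meta-tag check described above can be sketched roughly like this. It is only a minimal sketch using Python's standard library: the keyword list, the names TitleMetaParser and looks_unsafe, and the translate_to_english placeholder are all assumptions made for illustration, and fetching the page and the actual Translator API call are left out.

```python
from html.parser import HTMLParser

# Hypothetical English keyword list; a real one would be much longer.
UNSAFE_KEYWORDS = {"porn", "xxx", "adult"}


class TitleMetaParser(HTMLParser):
    """Collects the <title> text and the keywords/description meta tags."""

    def __init__(self):
        super().__init__()
        self.in_title = False
        self.texts = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self.in_title = True
        elif tag == "meta":
            name = (attrs.get("name") or "").lower()
            if name in ("keywords", "description"):
                self.texts.append(attrs.get("content") or "")

    def handle_endtag(self, tag):
        if tag == "title":
            self.in_title = False

    def handle_data(self, data):
        if self.in_title:
            self.texts.append(data)


def translate_to_english(text):
    # Placeholder (assumption): the translation API call would go here.
    return text


def looks_unsafe(html):
    """Return True if the title or meta tags contain an unsafe keyword."""
    parser = TitleMetaParser()
    parser.feed(html)
    words = translate_to_english(" ".join(parser.texts)).lower().split()
    # Strip surrounding punctuation so "adult," still matches "adult".
    cleaned = {w.strip(".,;:!?|-\"'()") for w in words}
    return bool(cleaned & UNSAFE_KEYWORDS)
```

Only the title and two meta tags are inspected, so this is a cheap first filter over millions of domains rather than a final verdict on any single site.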


Priority queue

A priority queue is a collection of elements such that each element has been assigned a priority and the order in which elements are deleted and processed comes from the following rules:

  • An element of higher priority is processed before any element of lower priority.
  • If two elements have the same priority, they are processed according to the order in which they were added to the queue.
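
The two rules above can be sketched in Python with the standard heapq module. This is only one possible implementation; the class name PriorityQueue and the convention that a larger number means a higher priority are assumptions made for this example.

```python
import heapq
import itertools


class PriorityQueue:
    """Higher priority first; equal priorities leave in insertion order."""

    def __init__(self):
        self._heap = []
        self._counter = itertools.count()  # records insertion order

    def push(self, item, priority):
        # heapq is a min-heap, so the priority is negated to pop the
        # largest priority first. The counter breaks ties in FIFO order
        # and keeps the items themselves from ever being compared.
        heapq.heappush(self._heap, (-priority, next(self._counter), item))

    def pop(self):
        _, _, item = heapq.heappop(self._heap)
        return item

    def __len__(self):
        return len(self._heap)


pq = PriorityQueue()
pq.push("write report", 1)
pq.push("fix outage", 5)
pq.push("answer email", 1)
order = [pq.pop() for _ in range(len(pq))]
# order == ["fix outage", "write report", "answer email"]
```

The insertion counter is what enforces the second rule: a plain heap of (priority, item) pairs makes no promise about the order of equal priorities.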

A well-known application of a priority queue is CPU scheduling:

  • The jobs which have higher priority are processed first.
  • If the priority of two jobs is the same, these jobs are processed according to their
    position in the queue.
  • A shorter job is given higher priority than a longer one.
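
As a rough illustration of these scheduling rules, here is a small Python sketch. The list above does not say how the last two rules interact, so this example assumes that among jobs of equal priority the shorter one runs first, and that queue position breaks any remaining tie. The Job fields and the schedule function are hypothetical names chosen for the example.

```python
from collections import namedtuple

# Larger priority number = more important; burst_time = how long the job runs.
Job = namedtuple("Job", ["name", "priority", "burst_time"])


def schedule(jobs):
    """Return job names in the order the CPU would run them."""
    # sorted() is stable, so jobs with the same priority and the same
    # burst time keep their original position in the queue.
    ordered = sorted(jobs, key=lambda job: (-job.priority, job.burst_time))
    return [job.name for job in ordered]


queue = [
    Job("backup", 1, 30),
    Job("compile", 2, 10),
    Job("email", 1, 5),
    Job("render", 2, 10),
]
run_order = schedule(queue)
# run_order == ["compile", "render", "email", "backup"]
```

Here "compile" and "render" tie on both priority and length, so they run in the order they were queued, while "email" beats "backup" only because it is shorter.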


Statistics Crash Course using RStudio

Just found a nice and easy three-day crash course that makes it simple to pick up statistics with the help of some programming, using a useful statistical tool called R. RStudio is an IDE for R with a clean UI and easy-to-use utilities.

STATISTICS CRASH COURSE DAY 1