Syed Jahanzaib Personnel Blog to Share Knowledge !

January 11, 2012

Howto Cache Youtube with SQUID / LUSCA and bypass Cached Videos from Mikrotik Queue


UPDATED: 10th MAY, 2012

BAD NEWS:

.

.

Some research update (Someone please confirm and post your comments)
YOUTUBE has split its videos into segments of 1.5 Mb each which is the approximation of the 51 seconds. I am sure YOUTUBE have taken this step to prevent people from caching  entire videos. If you have a video which is 100 Mb large, it will be split into about 55-60 segments.
As of right now, storeurl.pl wont be able to cache it. Currently VIDEOCACHE plugin is

doing full cache of youtube but at higher 400 $ Per Year :)

Following is an update version of youtube caching

http://aacable.wordpress.com/2012/01/19/youtube-caching-with-squid-2-7-using-storeurl-pl/

.

http://aacable.wordpress.com/2012/08/13/youtube-caching-with-squid-nginx/

.

.

.

 

What is LUSCA/SQUID ?

LUSCA is an advance version or Fork of  SQUID 2. The Lusca project aims to fix the shortcomings in the Squid-2. It also supports a variety of clustering protocols. By Using it, you can cache some dynamic contents that you previously can’t do with the squid.

For example
#  Video Cachingi.e Youtube / tube etc . . .
#  Windows / Linux Updates / Anti-virus , Anti-Malware i.e. Avira/ Avast / MBAM etc . . .
#  Well known sites i.e. facebook / google / yahoo etch. etch.
#  Download caching mp3′s/mpeg/avi etc . . .

Advantages of Youtube Caching   !!!

In most part of the world, bandwidth is very expensive, therefore it is (in some scenarios) very useful to Cache Youtube videos or any other flash videos, so if one of user downloads video / flash file , why again the same user or other user can’t download the same file from the CACHE, why he sucking the internet pipe for same content again n again?
Peoples on same LAN ,sometimes watch similar videos. If I put some youtube video link on on FACEBOOK, TWITTER or likewise , and all my friend will  watch that video and that particular video gets viewed many times in few hours. Usually the videos are shared over facebook or other social networking sites so the chances are high for multiple hits per popular videos for my lan users / friends.

This is the reason why I wrote this article. I have implemented Ubuntu with LUSCA/ Squid on it and its working great, but to achieve some results you need to have some TB of storage drives in your proxy machine.

Disadvantages of Youtube Caching   !!!

The chances, that another user will watch the same video, is really slim. if I search for something specific on youtube, i get more then hundreds of search results for same video. What is the chance that another user will search for the same thing, and will click on the same link / result? Youtube hosts more than 10 million videos. Which is too much to cache anyway. You need lot of space to cache videos. Also accordingly you will be needing ultra modern fast hardware with tons of RAM to handle such kind of cache giant. anyhow Try it

We will divide this article in following Sections

1#  Installing SQUID / LUSCA in UBUNTU
2#  Setting up SQUID / LUSCA Configuration files
3#  Performing some Tests, testing your Cache HIT
4# Using ZPH TOS to deliver cached contents to clients vai mikrotik at full LAN speed, Bypassing the User Queue for cached contents.

1#  Installing SQUID / LUSCA in UBUNTU

I assume your ubuntu box have 2 interfaces configured, one for LAN and second for WAN. You have internet sharing already configured. Now moving on to LUSCA / SQUID installation.

Download LUSCA source and compile it using,

mkdir /temp
cd /temp
wget http://lusca-cache.googlecode.com/files/LUSCA_HEAD-r14809.tar.gz
tar xzvf LUSCA_HEAD-r14809.tar.gz

Update & Install some tools for ubuntu

sudo apt-get update
sudo apt-get install gcc build-essential sharutils ccze libzip-dev automake1.9
cd LUSCA_HEAD-r14809

Now compile LUSCA with following options

./configure '--prefix=/usr/local/squid' '--enable-removal-policies=heap,lru' '--disable-dependency-tracking' '--disable-arp-acl' '--disable-cache-digests' '--enable-cachemgr-hostname=localhost' '--disable-delay-pools' '--enable-epoll' '--enable-external-acl-helpers=ip_user' '--disable-ident-lookups' '--enable-linux-netfilter' '--disable-referer-log' '--enable-removal-policies=heap,lru' '--disable-snmp' '--disable-ssl' '--enable-storeio=aufs,null,coss' '--disable-useragent-log' '--disable-wccpv2' '--with-aio' '--with-maxfd=1048576' '--with-dl' '--with-pthreads' 'build_alias=i686-redhat-linux-gnu' 'host_alias=i686-redhat-linux-gnu' 'targe_alias=i686-redhat-linux-gnu''--enable-truncate' '--disable-unlinkd' '--with-large-files' '--disable-htcp'
sudo make all
sudo make install

All of Lusca/Squid configuration files can be found at

/usr/local/squid/etc/
and squid executable can be found at
/usr/local/squid/sbin/

Now We will edit squid.conf file to make it customize according to our requirements by . . .

nano /usr/local/squid/etc/squid.conf

Delete all previously lines , and paste the following lines.

Remember following squid.conf is not very neat and clean , you will find many un necessary junk entries in it, but as I didn’t had time to clean them all, so you may clean them as per your targets and goals.

# SQUID 2.7/ LUSCA TEST CONFIG FILE
# Email: aacable@hotmail.com
# Web  : http://aacable.wordpress.com

# PORT and Transparent Option
http_port 8080 transparent
server_http11 on
icp_port 0

# Cache Directory , modify it according to your system.
# but first create directory in root by mkdir /cache1
# and then issue this command  chown proxy:proxy /cache1
# [for ubuntu user is proxy, in Fedora user is SQUID]
# I have set 500 MB for caching reserved just for caching ,
# adjust it according to your need.
# My recommendation is to have one cache_dir per drive. zzz

#store_dir_select_algorithm round-robin
cache_dir aufs /cache1 500 16 256
cache_replacement_policy heap LFUDA
memory_replacement_policy heap LFUDA

# If you want to enable DATE time n SQUID Logs,use following
emulate_httpd_log on
logformat squid %tl %6tr %>a %Ss/%03Hs %<st %rm %ru %un %Sh/%<A %mt
log_fqdn off

# How much days to keep users access web logs
# You need to rotate your log files with a cron job. For example:
# 0 0 * * * /usr/local/squid/bin/squid -k rotate
logfile_rotate 14
debug_options ALL,1
cache_access_log /var/log/squid/access.log
cache_log /var/log/squid/cache.log
cache_store_log /var/log/squid/store.log

#I used DNSAMSQ service for fast dns resolving
#so install by using "apt-get install dnsmasq" first
dns_nameservers 127.0.0.1 221.132.112.8
ftp_user anonymous@
ftp_list_width 32
ftp_passive on
ftp_sanitycheck on

#ACL Section
acl all src 0.0.0.0/0.0.0.0
acl manager proto cache_object
acl localhost src 127.0.0.1/255.255.255.255
acl to_localhost dst 127.0.0.0/8
acl SSL_ports port 443 563 # https, snews
acl SSL_ports port 873 # rsync
acl Safe_ports port 80 # http
acl Safe_ports port 21 # ftp
acl Safe_ports port 443 563 # https, snews
acl Safe_ports port 70 # gopher
acl Safe_ports port 210 # wais
acl Safe_ports port 1025-65535 # unregistered ports
acl Safe_ports port 280 # http-mgmt
acl Safe_ports port 488 # gss-http
acl Safe_ports port 591 # filemaker
acl Safe_ports port 777 # multiling http
acl Safe_ports port 631 # cups
acl Safe_ports port 873 # rsync
acl Safe_ports port 901 # SWAT
acl purge method PURGE
acl CONNECT method CONNECT
http_access allow manager localhost
http_access deny manager
http_access allow purge localhost
http_access deny purge
http_access deny !Safe_ports
http_access deny CONNECT !SSL_ports
http_access allow localhost
http_access allow all
http_reply_access allow all
icp_access allow all

#==========================
# Administrative Parameters
#==========================

# I used UBUNTU so user is proxy, in FEDORA you may use use squid
cache_effective_user proxy
cache_effective_group proxy
cache_mgr aacable@hotmail.com
visible_hostname proxy.aacable.net
unique_hostname aacable@hotmail.com

#=============
# ACCELERATOR
#=============
half_closed_clients off
quick_abort_min 0 KB
quick_abort_max 0 KB
vary_ignore_expire on
reload_into_ims on
log_fqdn off
memory_pools off
cache_swap_low 98
cache_swap_high 99
max_filedescriptors 65536
fqdncache_size 16384
retry_on_error on
offline_mode off
pipeline_prefetch on
# If you want to hide your proxy machine from being detected at various site use following
via off

#============================================
# OPTIONS WHICH AFFECT THE CACHE SIZE / zaib
#============================================
# If you have 4GB memory in Squid box, we will use formula of 1/3
# You can adjust it according to your need. IF squid is taking too much of RAM
# Then decrease it to 128 MB or even less.

cache_mem 8 MB
minimum_object_size 0 bytes
maximum_object_size 100 MB
maximum_object_size_in_memory 128 KB

#============================================================$
# SNMP , if you want to generate graphs for SQUID via MRTG
#============================================================$
#acl snmppublic snmp_community gl
#snmp_port 3401
#snmp_access allow snmppublic all
#snmp_access allow all

#============================================================
# ZPH , To enable cache content to be delivered at full lan speed,
# To bypass the queue at MT.
#============================================================
tcp_outgoing_tos 0x30 all
zph_mode tos
zph_local 0x30
zph_parent 0
zph_option 136

# Caching Youtube
acl videocache_allow_url url_regex -i \.youtube\.com\/get_video\?
acl videocache_allow_url url_regex -i \.youtube\.com\/videoplayback \.youtube\.com\/videoplay \.youtube\.com\/get_video\?
acl videocache_allow_url url_regex -i \.youtube\.[a-z][a-z]\/videoplayback \.youtube\.[a-z][a-z]\/videoplay \.youtube\.[a-z][a-z]\/get_video\?
acl videocache_allow_url url_regex -i \.googlevideo\.com\/videoplayback \.googlevideo\.com\/videoplay \.googlevideo\.com\/get_video\?
acl videocache_allow_url url_regex -i \.google\.com\/videoplayback \.google\.com\/videoplay \.google\.com\/get_video\?
acl videocache_allow_url url_regex -i \.google\.[a-z][a-z]\/videoplayback \.google\.[a-z][a-z]\/videoplay \.google\.[a-z][a-z]\/get_video\?
acl videocache_allow_url url_regex -i proxy[a-z0-9\-][a-z0-9][a-z0-9][a-z0-9]?\.dailymotion\.com\/
acl videocache_allow_url url_regex -i vid\.akm\.dailymotion\.com\/
acl videocache_allow_url url_regex -i [a-z0-9][0-9a-z][0-9a-z]?[0-9a-z]?[0-9a-z]?\.xtube\.com\/(.*)flv
acl videocache_allow_url url_regex -i \.vimeo\.com\/(.*)\.(flv|mp4)
acl videocache_allow_url url_regex -i va\.wrzuta\.pl\/wa[0-9][0-9][0-9][0-9]?
acl videocache_allow_url url_regex -i \.youporn\.com\/(.*)\.flv
acl videocache_allow_url url_regex -i \.msn\.com\.edgesuite\.net\/(.*)\.flv
acl videocache_allow_url url_regex -i \.tube8\.com\/(.*)\.(flv|3gp)
acl videocache_allow_url url_regex -i \.mais\.uol\.com\.br\/(.*)\.flv
acl videocache_allow_url url_regex -i \.blip\.tv\/(.*)\.(flv|avi|mov|mp3|m4v|mp4|wmv|rm|ram|m4v)
acl videocache_allow_url url_regex -i \.apniisp\.com\/(.*)\.(flv|avi|mov|mp3|m4v|mp4|wmv|rm|ram|m4v)
acl videocache_allow_url url_regex -i \.break\.com\/(.*)\.(flv|mp4)
acl videocache_allow_url url_regex -i redtube\.com\/(.*)\.flv
acl videocache_allow_dom dstdomain .mccont.com .metacafe.com .cdn.dailymotion.com
acl videocache_deny_dom  dstdomain .download.youporn.com .static.blip.tv
acl dontrewrite url_regex redbot\.org \.php
acl getmethod method GET

storeurl_access deny dontrewrite
storeurl_access deny !getmethod
storeurl_access deny videocache_deny_dom
storeurl_access allow videocache_allow_url
storeurl_access allow videocache_allow_dom
storeurl_access deny all

storeurl_rewrite_program /etc/squid/storeurl.pl
storeurl_rewrite_children 7
storeurl_rewrite_concurrency 100

acl store_rewrite_list urlpath_regex -i \/(get_video\?|videodownload\?|videoplayback.*id)
acl store_rewrite_list urlpath_regex -i \.flv$ \.mp3$ \.mp4$ \.swf$ \
storeurl_access allow store_rewrite_list
storeurl_access deny all

refresh_pattern -i \.flv$ 10080 80% 10080  override-expire override-lastmod reload-into-ims ignore-reload ignore-no-cache  ignore-private ignore-auth
refresh_pattern -i \.mp3$ 10080 80% 10080  override-expire override-lastmod reload-into-ims ignore-reload ignore-no-cache  ignore-private ignore-auth
refresh_pattern -i \.mp4$ 10080 80% 10080  override-expire override-lastmod reload-into-ims ignore-reload ignore-no-cache  ignore-private ignore-auth
refresh_pattern -i \.swf$ 10080 80% 10080  override-expire override-lastmod reload-into-ims ignore-reload ignore-no-cache  ignore-private ignore-auth
refresh_pattern -i \.gif$ 10080 80% 10080  override-expire override-lastmod reload-into-ims ignore-reload ignore-no-cache  ignore-private ignore-auth
refresh_pattern -i \.jpg$ 10080 80% 10080  override-expire override-lastmod reload-into-ims ignore-reload ignore-no-cache  ignore-private ignore-auth
refresh_pattern -i \.jpeg$ 10080 80% 10080  override-expire override-lastmod reload-into-ims ignore-reload ignore-no-cache  ignore-private  ignore-auth
refresh_pattern -i \.exe$ 10080 80% 10080  override-expire override-lastmod reload-into-ims ignore-reload ignore-no-cache  ignore-private  ignore-auth

# 1 year = 525600 mins, 1 month = 10080 mins, 1 day = 1440
refresh_pattern (get_video\?|videoplayback\?|videodownload\?|\.flv?)    10080 80% 10080 ignore-no-cache  ignore-private override-expire override-lastmod reload-into-ims
refresh_pattern (get_video\?|videoplayback\?id|videoplayback.*id|videodownload\?|\.flv?)    10080 80% 10080 ignore-no-cache  ignore-private override-expire override-lastmod reload-into-ims
refresh_pattern \.(ico|video-stats) 10080 80% 10080 override-expire ignore-reload ignore-no-cache  ignore-private ignore-auth override-lastmod  negative-ttl=10080
refresh_pattern \.etology\?                                     10080 80% 10080 override-expire ignore-reload ignore-no-cache
refresh_pattern galleries\.video(\?|sz)                         10080 80% 10080 override-expire ignore-reload ignore-no-cache
refresh_pattern brazzers\?                                      10080 80% 10080 override-expire ignore-reload ignore-no-cache
refresh_pattern \.adtology\?                                    10080 80% 10080 override-expire ignore-reload ignore-no-cache
refresh_pattern ^.*(utm\.gif|ads\?|rmxads\.com|ad\.z5x\.net|bh\.contextweb\.com|bstats\.adbrite\.com|a1\.interclick\.com|ad\.trafficmp\.com|ads\.cubics\.com|ad\.xtendmedia\.com|\.googlesyndication\.com|advertising\.com|yieldmanager|game-advertising\.com|pixel\.quantserve\.com|adperium\.com|doubleclick\.net|adserving\.cpxinteractive\.com|syndication\.com|media.fastclick.net).* 10080 20% 10080 ignore-no-cache  ignore-private override-expire ignore-reload ignore-auth   negative-ttl=40320 max-stale=10
refresh_pattern ^.*safebrowsing.*google  10080 80% 10080 override-expire ignore-reload ignore-no-cache ignore-private ignore-auth  negative-ttl=10080
refresh_pattern ^http://((cbk|mt|khm|mlt)[0-9]?)\.google\.co(m|\.uk)    10080 80% 10080 override-expire ignore-reload ignore-private  negative-ttl=10080
refresh_pattern ytimg\.com.*\.jpg                                       10080 80% 10080 override-expire ignore-reload
refresh_pattern images\.friendster\.com.*\.(png|gif)                    10080 80% 10080 override-expire ignore-reload
refresh_pattern garena\.com                                             10080 80% 10080 override-expire reload-into-ims
refresh_pattern photobucket.*\.(jp(e?g|e|2)|tiff?|bmp|gif|png)          10080 80% 10080 override-expire ignore-reload
refresh_pattern vid\.akm\.dailymotion\.com.*\.on2\?                     10080 80% 10080 ignore-no-cache override-expire override-lastmod
refresh_pattern mediafire.com\/images.*\.(jp(e?g|e|2)|tiff?|bmp|gif|png)    10080 80% 10080 reload-into-ims override-expire ignore-private
refresh_pattern ^http:\/\/images|pics|thumbs[0-9]\.                     10080 80% 10080 reload-into-ims ignore-no-cache  ignore-reload override-expire
refresh_pattern ^http:\/\/www.onemanga.com.*\/                          10080 80% 10080 reload-into-ims ignore-no-cache  ignore-reload override-expire
refresh_pattern ^http://v\.okezone\.com/get_video\/([a-zA-Z0-9]) 10080 80% 10080 override-expire ignore-reload ignore-no-cache  ignore-private ignore-auth override-lastmod  negative-ttl=10080

#images facebook
refresh_pattern -i \.facebook.com.*\.(jpg|png|gif)                      10080 80% 10080 ignore-reload override-expire ignore-no-cache
refresh_pattern -i \.fbcdn.net.*\.(jpg|gif|png|swf|mp3)                 10080 80% 10080 ignore-reload override-expire ignore-no-cache
refresh_pattern  static\.ak\.fbcdn\.net*\.(jpg|gif|png)                 10080 80% 10080 ignore-reload override-expire ignore-no-cache
refresh_pattern ^http:\/\/profile\.ak\.fbcdn.net*\.(jpg|gif|png)        10080 80% 10080 ignore-reload override-expire ignore-no-cache

#All File
refresh_pattern -i \.(3gp|7z|ace|asx|bin|deb|divx|dvr-ms|ram|rpm|exe|inc|cab|qt)       10080 80% 10080 ignore-no-cache   override-expire override-lastmod reload-into-ims
refresh_pattern -i \.(rar|jar|gz|tgz|bz2|iso|m1v|m2(v|p)|mo(d|v)|arj|lha|lzh|zip|tar)  10080 80% 10080 ignore-no-cache   override-expire override-lastmod reload-into-ims
refresh_pattern -i \.(jp(e?g|e|2)|gif|pn[pg]|bm?|tiff?|ico|swf|dat|ad|txt|dll)         10080 80% 10080 ignore-no-cache   override-expire override-lastmod reload-into-ims
refresh_pattern -i \.(avi|ac4|mp(e?g|a|e|1|2|3|4)|mk(a|v)|ms(i|u|p)|og(x|v|a|g)|rm|r(a|p)m|snd|vob) 10080 80% 10080 ignore-no-cache   override-expire override-lastmod reload-into-ims
refresh_pattern -i \.(pp(t?x)|s|t)|pdf|rtf|wax|wm(a|v)|wmx|wpl|cb(r|z|t)|xl(s?x)|do(c?x)|flv|x-flv) 10080 80% 10080 ignore-no-cache   override-expire override-lastmod reload-into-ims

refresh_pattern -i (/cgi-bin/|\?)  0  0%  0
refresh_pattern ^gopher:        1440    0%      1440
refresh_pattern ^ftp:           10080   95% 10080 override-lastmod reload-into-ims
refresh_pattern .               1440    95% 10080 override-lastmod reload-into-ims

Now create cache dir and logs files , and assign them necessary permissions.

mkdir /cache1
chown proxy:proxy /cache1
mkdir /var/log/squid
chmod 777 /var/log/squid

Now initialize cache dir by

/usr/local/squid/sbin/squid -z

SOTEURL.PL

Now We have to create an important file name storeurl.pl , which is very important and actually it does the
main job to pull video from cache.

mkdir /etc/squid
touch /etc/squid/storeurl.pl
chmod +x /etc/squid/storeurl.pl
nano /etc/squid/storeurl.pl

Now paste the following lines, then Save and exit.

#!/usr/bin/perl
# This script is NOT written or modified by me, I only copy pasted it from the internet.
# It was First originally Writen by chudy_fernandez@yahoo.com
# & Have been modified by various persons over the net to fix/add various functions.
# For Example this ver was modified by member of comstuff.net to satisfy common and dynamic content.
# th30nly @comstuff.net a.k.a invisible_theater ,
# For more info, http://wiki.squid-cache.org/ConfigExamples/DynamicContent/YouTube
$|=1;
while (<>) {
@X = split;
#       $X[1] =~ s/&sig=.*//;
$x = $X[0] . " ";
$_ = $X[1];
$u = $X[1];

#speedtest
if (m/^http:\/\/(.*)\/speedtest\/(.*\.(jpg|txt))\?(.*)/) {
print $x . "http://www.speedtest.net.SQUIDINTERNAL/speedtest/" . $2 . "\n";

#mediafire
}elsif (m/^http:\/\/199\.91\.15\d\.\d*\/\w{12}\/(\w*)\/(.*)/) {
print $x . "http://www.mediafire.com.SQUIDINTERNAL/" . $1 ."/" . $2 . "\n";

#fileserve
}elsif (m/^http:\/\/fs\w*\.fileserve\.com\/file\/(\w*)\/[\w-]*\.\/(.*)/) {
print $x . "http://www.fileserve.com.SQUIDINTERNAL/" . $1 . "./" . $2 . "\n";

#filesonic
}elsif (m/^http:\/\/s[0-9]*\.filesonic\.com\/download\/([0-9]*)\/(.*)/) {
print $x . "http://www.filesonic.com.SQUIDINTERNAL/" . $1 . "\n";

#4shared
}elsif (m/^http:\/\/[a-zA-Z]{2}\d*\.4shared\.com(:8080|)\/download\/(.*)\/(.*\..*)\?.*/) {
print $x . "http://www.4shared.com.SQUIDINTERNAL/download/$2\/$3\n";

#4shared preview
}elsif (m/^http:\/\/[a-zA-Z]{2}\d*\.4shared\.com(:8080|)\/img\/(\d*)\/\w*\/dlink__2Fdownload_2F(\w*)_3Ftsid_3D[\w-]*\/preview\.mp3\?sId=\w*/) {
print $x . "http://www.4shared.com.SQUIDINTERNAL/$2\n";

#photos-X.ak.fbcdn.net where X a-z
}elsif (m/^http:\/\/photos-[a-z](\.ak\.fbcdn\.net)(\/.*\/)(.*\.jpg)/) {
print $x . "http://photos" . $1 . "/" . $2 . $3  . "\n";

#YX.sphotos.ak.fbcdn.net where X 1-9, Y a-z
} elsif (m/^http:\/\/[a-z][0-9]\.sphotos\.ak\.fbcdn\.net\/(.*)\/(.*)/) {
print $x . "http://photos.ak.fbcdn.net/" . $1  ."/". $2 . "\n";

#maps.google.com
} elsif (m/^http:\/\/(cbk|mt|khm|mlt|tbn)[0-9]?(.google\.co(m|\.uk|\.id).*)/) {
print $x . "http://" . $1  . $2 . "\n";

# compatibility for old cached get_video?video_id
} elsif (m/^http:\/\/([0-9.]{4}|.*\.youtube\.com|.*\.googlevideo\.com|.*\.video\.google\.com).*?(videoplayback\?id=.*?|video_id=.*?)\&(.*?)/) {
$z = $2; $z =~ s/video_id=/get_video?video_id=/;
print $x . "http://video-srv.youtube.com.SQUIDINTERNAL/" . $z . "\n";

# youtube fix
} elsif (m/^http:\/\/([0-9.]{4}|.*\.youtube\.com|.*\.googlevideo\.com|.*\.video\.google\.com)\/videoplayback\?(.*)/) {
$p_str = $2;
$tag = "";
$alg = "";
$id = "";
$range = "";
if ($p_str =~ m/(itag=[0-9]*)/){$tag = "&".$1}
if ($p_str =~ m/(algorithm=[a-z]*\-[a-z]*)/){$alg = "&".$1}
if ($p_str =~ m/(id=[a-zA-Z0-9]*)/){$id = "&".$1}
if ($p_str =~ m/(range=[0-9\-]*)/){$range = "&".$1; $range =~ s/-//; $range =~ s/range=//; }
print $x . "http://video-srv.youtube.com.SQUIDINTERNAL/" . $tag . "&" . $alg . "&" . $id . "&" . $range . "\n";

} elsif (m/^http:\/\/www\.google-analytics\.com\/__utm\.gif\?.*/) {
print $x . "http://www.google-analytics.com/__utm.gif\n";

#Cache High Latency Ads
} elsif (m/^http:\/\/([a-z0-9.]*)(\.doubleclick\.net|\.quantserve\.com|\.googlesyndication\.com|yieldmanager|cpxinteractive)(.*)/) {
$y = $3;$z = $2;
for ($y) {
s/pixel;.*/pixel/;
s/activity;.*/activity/;
s/(imgad[^&]*).*/\1/;
s/;ord=[?0-9]*//;
s/;&timestamp=[0-9]*//;
s/[&?]correlator=[0-9]*//;
s/&cookie=[^&]*//;
s/&ga_hid=[^&]*//;
s/&ga_vid=[^&]*//;
s/&ga_sid=[^&]*//;
# s/&prev_slotnames=[^&]*//
# s/&u_his=[^&]*//;
s/&dt=[^&]*//;
s/&dtd=[^&]*//;
s/&lmt=[^&]*//;
s/(&alternate_ad_url=http%3A%2F%2F[^(%2F)]*)[^&]*/\1/;
s/(&url=http%3A%2F%2F[^(%2F)]*)[^&]*/\1/;
s/(&ref=http%3A%2F%2F[^(%2F)]*)[^&]*/\1/;
s/(&cookie=http%3A%2F%2F[^(%2F)]*)[^&]*/\1/;
s/[;&?]ord=[?0-9]*//;
s/[;&]mpvid=[^&;]*//;
s/&xpc=[^&]*//;
# yieldmanager
s/\?clickTag=[^&]*//;
s/&u=[^&]*//;
s/&slotname=[^&]*//;
s/&page_slots=[^&]*//;
}
print $x . "http://" . $1 . $2 . $y . "\n";

#cache high latency ads
} elsif (m/^http:\/\/(.*?)\/(ads)\?(.*?)/) {
print $x . "http://" . $1 . "/" . $2  . "\n";

# spicific servers starts here....
} elsif (m/^http:\/\/(www\.ziddu\.com.*\.[^\/]{3,4})\/(.*?)/) {
print $x . "http://" . $1 . "\n";

#cdn, varialble 1st path
} elsif (($u =~ /filehippo/) && (m/^http:\/\/(.*?)\.(.*?)\/(.*?)\/(.*)\.([a-z0-9]{3,4})(\?.*)?/)) {
@y = ($1,$2,$4,$5);
$y[0] =~ s/[a-z0-9]{2,5}/cdn./;
print $x . "http://" . $y[0] . $y[1] . "/" . $y[2] . "." . $y[3] . "\n";

#rapidshare
} elsif (($u =~ /rapidshare/) && (m/^http:\/\/(([A-Za-z]+[0-9-.]+)*?)([a-z]*\.[^\/]{3}\/[a-z]*\/[0-9]*)\/(.*?)\/([^\/\?\&]{4,})$/)) {
print $x . "http://cdn." . $3 . "/SQUIDINTERNAL/" . $5 . "\n";

} elsif (($u =~ /maxporn/) && (m/^http:\/\/([^\/]*?)\/(.*?)\/([^\/]*?)(\?.*)?$/)) {
print $x . "http://" . $1 . "/SQUIDINTERNAL/" . $3 . "\n";

#like porn hub variables url and center part of the path, filename etention 3 or 4 with or without ? at the end
} elsif (($u =~ /tube8|pornhub|xvideos/) && (m/^http:\/\/(([A-Za-z]+[0-9-.]+)*?(\.[a-z]*)?)\.([a-z]*[0-9]?\.[^\/]{3}\/[a-z]*)(.*?)((\/[a-z]*)?(\/[^\/]*){4}\.[^\/\?]{3,4})(\?.*)?$/)) {
print $x . "http://cdn." . $4 . $6 . "\n";
#...spicific servers end here.

#photos-X.ak.fbcdn.net where X a-z
} elsif (m/^http:\/\/photos-[a-z].ak.fbcdn.net\/(.*)/) {
print $x . "http://photos.ak.fbcdn.net/" . $1  . "\n";

#for yimg.com video
} elsif (m/^http:\/\/(.*yimg.com)\/\/(.*)\/([^\/\?\&]*\/[^\/\?\&]*\.[^\/\?\&]{3,4})(\?.*)?$/) {
print $x . "http://cdn.yimg.com//" . $3 . "\n";

#for yimg.com doubled
} elsif (m/^http:\/\/(.*?)\.yimg\.com\/(.*?)\.yimg\.com\/(.*?)\?(.*)/) {
print $x . "http://cdn.yimg.com/"  . $3 . "\n";

#for yimg.com with &sig=
} elsif (m/^http:\/\/(.*?)\.yimg\.com\/(.*)/) {
@y = ($1,$2);
$y[0] =~ s/[a-z]+[0-9]+/cdn/;
$y[1] =~ s/&sig=.*//;
print $x . "http://" . $y[0] . ".yimg.com/"  . $y[1] . "\n";

#youjizz. We use only domain and filename
} elsif (($u =~ /media[0-9]{2,5}\.youjizz/) && (m/^http:\/\/(.*)(\.[^\.\-]*?\..*?)\/(.*)\/([^\/\?\&]*)\.([^\/\?\&]{3,4})((\?|\%).*)?$/)) {
@y = ($1,$2,$4,$5);
$y[0] =~ s/(([a-zA-A]+[0-9]+(-[a-zA-Z])?$)|(.*cdn.*)|(.*cache.*))/cdn/;
print $x . "http://" . $y[0] . $y[1] . "/" . $y[2] . "." . $y[3] . "\n";

#general purpose for cdn servers. add above your specific servers.
} elsif (m/^http:\/\/([0-9.]*?)\/\/(.*?)\.(.*)\?(.*?)/) {
print $x . "http://squid-cdn-url//" . $2  . "." . $3 . "\n";

#generic http://variable.domain.com/path/filename."ex" "ext" or "exte" with or withour "? or %"
} elsif (m/^http:\/\/(.*)(\.[^\.\-]*?\..*?)\/(.*)\.([^\/\?\&]{2,4})((\?|\%).*)?$/) {
@y = ($1,$2,$3,$4);
$y[0] =~ s/(([a-zA-A]+[0-9]+(-[a-zA-Z])?$)|(.*cdn.*)|(.*cache.*))/cdn/;
print $x . "http://" . $y[0] . $y[1] . "/" . $y[2] . "." . $y[3] . "\n";

# generic http://variable.domain.com/...
} elsif (m/^http:\/\/(([A-Za-z]+[0-9-]+)*?|.*cdn.*|.*cache.*)\.(.*?)\.(.*?)\/(.*)$/) {
print $x . "http://cdn." . $3 . "." . $4 . "/" . $5 .  "\n";

# spicific extention that ends with ?
} elsif (m/^http:\/\/(.*?)\/(.*?)\.(jp(e?g|e|2)|gif|png|tiff?|bmp|ico|flv|on2)(.*)/) {
print $x . "http://" . $1 . "/" . $2  . "." . $3 . "\n";

# all that ends with ;
} elsif (m/^http:\/\/(.*?)\/(.*?)\;(.*)/) {
print $x . "http://" . $1 . "/" . $2  . "\n";

} else {
print $x . $_ . "sucks\n";
}
}

Now start SQUIDServer by

/usr/local/squid/sbin/squid

TIP:
To start SQUID Server in Debug mode, to check any erros, use

/usr/local/squid/sbin/squid -d1

3# TESTING YOUR CACHE HIT

It’s time to hit the ROAD and do performing some tests.

YOUTUBE TEST

Open Youtube and watch any Video. After complete download, Check the same video from another client. You will notice that it download very quickly , you can watch the bar moving fast.
As Shown in the example Below . . .

monitor the Squid access LOG. You will see cache hit TPC_HIT for this video.
As Shown in the example Below . . .

MUSIC DOWNLOAD TEST

Now test any music download. For example Go to
http://www.apniisp.com/songs/indian-movie-songs/ladies-vs-ricky-bahl/690/1.html
As Shown in the example Below . . .

and download any song , after its downloaded, goto 2nd client pc, and download the same song, and monitor the Squid access LOG. You will see cache hit TPC_HIT for this song.

As Shown in the example Below . . .

EXE / PROGRAM  DOWNLOAD TEST

Now test any .exe file download.
Goto http://www.rarlabs.com and download any package. After Download completes, goto 2nd client pc , and download the same file again. and monitor the Squid access LOG. You will see cache hit TPC_HIT for this file.

As Shown in the example Below . . .

SQUID LOGS

If you face “An error occured” in cached videos , see the following

http://aacable.wordpress.com/2012/01/30/youtube-caching-problem-an-error-occured-please-try-again-later-solved/

Regard’s
SYED JAHANZAIB
http://aacable.wordpress.com

The Silver is the New Black Theme Blog at WordPress.com.

Follow

Get every new post delivered to your Inbox.

Join 2,047 other followers