Syed Jahanzaib Personnel Blog to Share Knowledge !

January 19, 2012

Youtube caching with SQUID 2.7 [using storeurl.pl]

Filed under: Linux Related — Tags: , — Syed Jahanzaib / Pinochio~:) @ 11:34 AM

 UPDATED: 10th MAY, 2012

BAD NEWS:

Some research update (Someone Please confirm & post your comments)

YOUTUBE has split its videos into segments of 1.7 Mb each which is the approximation of the 51 seconds. I am sure YOUTUBE have taken this step to prevent people from caching  entire videos, [In my opinion its good to prevent squid from filling up squickly]. If you are watching a a video which is 100 Mb large, it will be split into about 55-60 segments.
As of right now, storeurl.pl wont be able to cache entire video, only first 51 seconds will be cached.
As of right now VIDEOCACHE plugin is doing full cache of youtube videos but at higher 400 $ Per Year price :)

I will update if I found out any solution to it. Currently I am busy in my new job, so its very hard to manage time for this.

UPDATED: 2nd April, 2012

This is a Shorten Version of  http://aacable.wordpress.com/tag/aacable-howto-cache-youtube/

This is a quick reference guide for SQUID 2.7 installation on Ubuntu Desktop ver 10.4 with youtube caching supported. Make sure you have setup proper internet connection in Ubuntu BOX.

Install SQUID by

apt-get install squid

After installation done, Now edit it’s configuration file

nano /etc/squid/squid.conf

remove current lines and paste all squid.conf as follows.

# SQUID 2.7/ LUSCA TEST CONFIG FILE
# Email: aacable@hotmail.com
# Web  : http://aacable.wordpress.com

# PORT and Transparent Option
http_port 8080 transparent
server_http11 on
icp_port 0

# Cache Directory , modify it according to your system.
# but first create directory in root by mkdir /cache1
# and then issue this command  chown proxy:proxy /cache1
# [for ubuntu user is proxy, in Fedora user is SQUID]
# I have set 500 MB for caching reserved just for caching ,
# adjust it according to your need.
# My recommendation is to have one cache_dir per drive. zzz

#store_dir_select_algorithm round-robin
cache_dir aufs /cache1 500 16 256
cache_replacement_policy heap LFUDA
memory_replacement_policy heap LFUDA

# If you want to enable DATE time n SQUID Logs,use following
emulate_httpd_log on
logformat squid %tl %6tr %>a %Ss/%03Hs %<st %rm %ru %un %Sh/%<A %mt
log_fqdn off

# How much days to keep users access web logs
# You need to rotate your log files with a cron job. For example:
# 0 0 * * * /usr/local/squid/bin/squid -k rotate
logfile_rotate 14
debug_options ALL,1
cache_access_log /var/log/squid/access.log
cache_log /var/log/squid/cache.log
cache_store_log /var/log/squid/store.log

#I used DNSAMSQ service for fast dns resolving
#so install by using "apt-get install dnsmasq" first
dns_nameservers 127.0.0.1 221.132.112.8
ftp_user anonymous@
ftp_list_width 32
ftp_passive on
ftp_sanitycheck on

#ACL Section
acl all src 0.0.0.0/0.0.0.0
acl manager proto cache_object
acl localhost src 127.0.0.1/255.255.255.255
acl to_localhost dst 127.0.0.0/8
acl SSL_ports port 443 563 # https, snews
acl SSL_ports port 873 # rsync
acl Safe_ports port 80 # http
acl Safe_ports port 21 # ftp
acl Safe_ports port 443 563 # https, snews
acl Safe_ports port 70 # gopher
acl Safe_ports port 210 # wais
acl Safe_ports port 1025-65535 # unregistered ports
acl Safe_ports port 280 # http-mgmt
acl Safe_ports port 488 # gss-http
acl Safe_ports port 591 # filemaker
acl Safe_ports port 777 # multiling http
acl Safe_ports port 631 # cups
acl Safe_ports port 873 # rsync
acl Safe_ports port 901 # SWAT
acl purge method PURGE
acl CONNECT method CONNECT
http_access allow manager localhost
http_access deny manager
http_access allow purge localhost
http_access deny purge
http_access deny !Safe_ports
http_access deny CONNECT !SSL_ports
http_access allow localhost
http_access allow all
http_reply_access allow all
icp_access allow all

#==========================
# Administrative Parameters
#==========================

# I used UBUNTU so user is proxy, in FEDORA you may use use squid
cache_effective_user proxy
cache_effective_group proxy
cache_mgr aacable@hotmail.com
visible_hostname proxy.aacable.net
unique_hostname aacable@hotmail.com

#=============
# ACCELERATOR
#=============
half_closed_clients off
quick_abort_min 0 KB
quick_abort_max 0 KB
vary_ignore_expire on
reload_into_ims on
log_fqdn off
memory_pools off
cache_swap_low 98
cache_swap_high 99
max_filedescriptors 65536
fqdncache_size 16384
retry_on_error on
offline_mode off
pipeline_prefetch on
# If you want to hide your proxy machine from being detected at various site use following
via off

#============================================
# OPTIONS WHICH AFFECT THE CACHE SIZE / zaib
#============================================
# If you have 4GB memory in Squid box, we will use formula of 1/3
# You can adjust it according to your need. IF squid is taking too much of RAM
# Then decrease it to 128 MB or even less.

cache_mem 8 MB
minimum_object_size 0 bytes
maximum_object_size 100 MB
maximum_object_size_in_memory 128 KB

#============================================================$
# SNMP , if you want to generate graphs for SQUID via MRTG
#============================================================$
#acl snmppublic snmp_community gl
#snmp_port 3401
#snmp_access allow snmppublic all
#snmp_access allow all

#============================================================
# ZPH , To enable cache content to be delivered at full lan speed,
# To bypass the queue at MT.
#============================================================
tcp_outgoing_tos 0x30 all
zph_mode tos
zph_local 0x30
zph_parent 0
zph_option 136

# Caching Youtube
acl videocache_allow_url url_regex -i \.youtube\.com\/get_video\?
acl videocache_allow_url url_regex -i \.youtube\.com\/videoplayback \.youtube\.com\/videoplay \.youtube\.com\/get_video\?
acl videocache_allow_url url_regex -i \.youtube\.[a-z][a-z]\/videoplayback \.youtube\.[a-z][a-z]\/videoplay \.youtube\.[a-z][a-z]\/get_video\?
acl videocache_allow_url url_regex -i \.googlevideo\.com\/videoplayback \.googlevideo\.com\/videoplay \.googlevideo\.com\/get_video\?
acl videocache_allow_url url_regex -i \.google\.com\/videoplayback \.google\.com\/videoplay \.google\.com\/get_video\?
acl videocache_allow_url url_regex -i \.google\.[a-z][a-z]\/videoplayback \.google\.[a-z][a-z]\/videoplay \.google\.[a-z][a-z]\/get_video\?
acl videocache_allow_url url_regex -i proxy[a-z0-9\-][a-z0-9][a-z0-9][a-z0-9]?\.dailymotion\.com\/
acl videocache_allow_url url_regex -i vid\.akm\.dailymotion\.com\/
acl videocache_allow_url url_regex -i [a-z0-9][0-9a-z][0-9a-z]?[0-9a-z]?[0-9a-z]?\.xtube\.com\/(.*)flv
acl videocache_allow_url url_regex -i \.vimeo\.com\/(.*)\.(flv|mp4)
acl videocache_allow_url url_regex -i va\.wrzuta\.pl\/wa[0-9][0-9][0-9][0-9]?
acl videocache_allow_url url_regex -i \.youporn\.com\/(.*)\.flv
acl videocache_allow_url url_regex -i \.msn\.com\.edgesuite\.net\/(.*)\.flv
acl videocache_allow_url url_regex -i \.tube8\.com\/(.*)\.(flv|3gp)
acl videocache_allow_url url_regex -i \.mais\.uol\.com\.br\/(.*)\.flv
acl videocache_allow_url url_regex -i \.blip\.tv\/(.*)\.(flv|avi|mov|mp3|m4v|mp4|wmv|rm|ram|m4v)
acl videocache_allow_url url_regex -i \.apniisp\.com\/(.*)\.(flv|avi|mov|mp3|m4v|mp4|wmv|rm|ram|m4v)
acl videocache_allow_url url_regex -i \.break\.com\/(.*)\.(flv|mp4)
acl videocache_allow_url url_regex -i redtube\.com\/(.*)\.flv
acl videocache_allow_dom dstdomain .mccont.com .metacafe.com .cdn.dailymotion.com
acl videocache_deny_dom  dstdomain .download.youporn.com .static.blip.tv
acl dontrewrite url_regex redbot\.org \.php
acl getmethod method GET

storeurl_access deny dontrewrite
storeurl_access deny !getmethod
storeurl_access deny videocache_deny_dom
storeurl_access allow videocache_allow_url
storeurl_access allow videocache_allow_dom
storeurl_access deny all

storeurl_rewrite_program /etc/squid/storeurl.pl
storeurl_rewrite_children 7
storeurl_rewrite_concurrency 100

acl store_rewrite_list urlpath_regex -i \/(get_video\?|videodownload\?|videoplayback.*id)
acl store_rewrite_list urlpath_regex -i \.flv$ \.mp3$ \.mp4$ \.swf$ \
storeurl_access allow store_rewrite_list
storeurl_access deny all

refresh_pattern -i \.flv$ 10080 80% 10080  override-expire override-lastmod reload-into-ims ignore-reload ignore-no-cache  ignore-private ignore-auth
refresh_pattern -i \.mp3$ 10080 80% 10080  override-expire override-lastmod reload-into-ims ignore-reload ignore-no-cache  ignore-private ignore-auth
refresh_pattern -i \.mp4$ 10080 80% 10080  override-expire override-lastmod reload-into-ims ignore-reload ignore-no-cache  ignore-private ignore-auth
refresh_pattern -i \.swf$ 10080 80% 10080  override-expire override-lastmod reload-into-ims ignore-reload ignore-no-cache  ignore-private ignore-auth
refresh_pattern -i \.gif$ 10080 80% 10080  override-expire override-lastmod reload-into-ims ignore-reload ignore-no-cache  ignore-private ignore-auth
refresh_pattern -i \.jpg$ 10080 80% 10080  override-expire override-lastmod reload-into-ims ignore-reload ignore-no-cache  ignore-private ignore-auth
refresh_pattern -i \.jpeg$ 10080 80% 10080  override-expire override-lastmod reload-into-ims ignore-reload ignore-no-cache  ignore-private  ignore-auth
refresh_pattern -i \.exe$ 10080 80% 10080  override-expire override-lastmod reload-into-ims ignore-reload ignore-no-cache  ignore-private  ignore-auth

# 1 year = 525600 mins, 1 month = 10080 mins, 1 day = 1440
refresh_pattern (get_video\?|videoplayback\?|videodownload\?|\.flv?)    10080 80% 10080 ignore-no-cache  ignore-private override-expire override-lastmod reload-into-ims
refresh_pattern (get_video\?|videoplayback\?id|videoplayback.*id|videodownload\?|\.flv?)    10080 80% 10080 ignore-no-cache  ignore-private override-expire override-lastmod reload-into-ims
refresh_pattern \.(ico|video-stats) 10080 80% 10080 override-expire ignore-reload ignore-no-cache  ignore-private ignore-auth override-lastmod  negative-ttl=10080
refresh_pattern \.etology\?                                     10080 80% 10080 override-expire ignore-reload ignore-no-cache
refresh_pattern galleries\.video(\?|sz)                         10080 80% 10080 override-expire ignore-reload ignore-no-cache
refresh_pattern brazzers\?                                      10080 80% 10080 override-expire ignore-reload ignore-no-cache
refresh_pattern \.adtology\?                                    10080 80% 10080 override-expire ignore-reload ignore-no-cache
refresh_pattern ^.*(utm\.gif|ads\?|rmxads\.com|ad\.z5x\.net|bh\.contextweb\.com|bstats\.adbrite\.com|a1\.interclick\.com|ad\.trafficmp\.com|ads\.cubics\.com|ad\.xtendmedia\.com|\.googlesyndication\.com|advertising\.com|yieldmanager|game-advertising\.com|pixel\.quantserve\.com|adperium\.com|doubleclick\.net|adserving\.cpxinteractive\.com|syndication\.com|media.fastclick.net).* 10080 20% 10080 ignore-no-cache  ignore-private override-expire ignore-reload ignore-auth   negative-ttl=40320 max-stale=10
refresh_pattern ^.*safebrowsing.*google  10080 80% 10080 override-expire ignore-reload ignore-no-cache ignore-private ignore-auth  negative-ttl=10080
refresh_pattern ^http://((cbk|mt|khm|mlt)[0-9]?)\.google\.co(m|\.uk)    10080 80% 10080 override-expire ignore-reload ignore-private  negative-ttl=10080
refresh_pattern ytimg\.com.*\.jpg                                       10080 80% 10080 override-expire ignore-reload
refresh_pattern images\.friendster\.com.*\.(png|gif)                    10080 80% 10080 override-expire ignore-reload
refresh_pattern garena\.com                                             10080 80% 10080 override-expire reload-into-ims
refresh_pattern photobucket.*\.(jp(e?g|e|2)|tiff?|bmp|gif|png)          10080 80% 10080 override-expire ignore-reload
refresh_pattern vid\.akm\.dailymotion\.com.*\.on2\?                     10080 80% 10080 ignore-no-cache override-expire override-lastmod
refresh_pattern mediafire.com\/images.*\.(jp(e?g|e|2)|tiff?|bmp|gif|png)    10080 80% 10080 reload-into-ims override-expire ignore-private
refresh_pattern ^http:\/\/images|pics|thumbs[0-9]\.                     10080 80% 10080 reload-into-ims ignore-no-cache  ignore-reload override-expire
refresh_pattern ^http:\/\/www.onemanga.com.*\/                          10080 80% 10080 reload-into-ims ignore-no-cache  ignore-reload override-expire
refresh_pattern ^http://v\.okezone\.com/get_video\/([a-zA-Z0-9]) 10080 80% 10080 override-expire ignore-reload ignore-no-cache  ignore-private ignore-auth override-lastmod  negative-ttl=10080

#images facebook
refresh_pattern -i \.facebook.com.*\.(jpg|png|gif)                      10080 80% 10080 ignore-reload override-expire ignore-no-cache
refresh_pattern -i \.fbcdn.net.*\.(jpg|gif|png|swf|mp3)                 10080 80% 10080 ignore-reload override-expire ignore-no-cache
refresh_pattern  static\.ak\.fbcdn\.net*\.(jpg|gif|png)                 10080 80% 10080 ignore-reload override-expire ignore-no-cache
refresh_pattern ^http:\/\/profile\.ak\.fbcdn.net*\.(jpg|gif|png)        10080 80% 10080 ignore-reload override-expire ignore-no-cache

#All File
refresh_pattern -i \.(3gp|7z|ace|asx|bin|deb|divx|dvr-ms|ram|rpm|exe|inc|cab|qt)       10080 80% 10080 ignore-no-cache   override-expire override-lastmod reload-into-ims
refresh_pattern -i \.(rar|jar|gz|tgz|bz2|iso|m1v|m2(v|p)|mo(d|v)|arj|lha|lzh|zip|tar)  10080 80% 10080 ignore-no-cache   override-expire override-lastmod reload-into-ims
refresh_pattern -i \.(jp(e?g|e|2)|gif|pn[pg]|bm?|tiff?|ico|swf|dat|ad|txt|dll)         10080 80% 10080 ignore-no-cache   override-expire override-lastmod reload-into-ims
refresh_pattern -i \.(avi|ac4|mp(e?g|a|e|1|2|3|4)|mk(a|v)|ms(i|u|p)|og(x|v|a|g)|rm|r(a|p)m|snd|vob) 10080 80% 10080 ignore-no-cache   override-expire override-lastmod reload-into-ims
refresh_pattern -i \.(pp(t?x)|s|t)|pdf|rtf|wax|wm(a|v)|wmx|wpl|cb(r|z|t)|xl(s?x)|do(c?x)|flv|x-flv) 10080 80% 10080 ignore-no-cache   override-expire override-lastmod reload-into-ims

refresh_pattern -i (/cgi-bin/|\?)  0  0%  0
refresh_pattern ^gopher:        1440    0%      1440
refresh_pattern ^ftp:           10080   95% 10080 override-lastmod reload-into-ims
refresh_pattern .               1440    95% 10080 override-lastmod reload-into-ims

Save& Exit.

STOREURL.PL

Now create storeurl.pl which will be used to pull youtube video from cache.

touch /etc/squid/storeurl.pl
chmod +x /etc/squid/storeurl.pl

Now edit this file and paste the following contents.

nano /etc/squid/storeurl.pl

#!/usr/bin/perl
# This script is NOT written or modified by me, I only copy pasted it from the internet.
# It was First originally Writen by chudy_fernandez@yahoo.com
# & Have been modified by various persons over the net to fix/add various functions.
# For Example this ver was modified by member of comstuff.net to satisfy common and dynamic content.
# th30nly @comstuff.net a.k.a invisible_theater ,
# For more info, http://wiki.squid-cache.org/ConfigExamples/DynamicContent/YouTube
$|=1;
while (<>) {
@X = split;
#       $X[1] =~ s/&sig=.*//;
$x = $X[0] . " ";
$_ = $X[1];
$u = $X[1];

#speedtest
if (m/^http:\/\/(.*)\/speedtest\/(.*\.(jpg|txt))\?(.*)/) {
print $x . "http://www.speedtest.net.SQUIDINTERNAL/speedtest/" . $2 . "\n";

#mediafire
}elsif (m/^http:\/\/199\.91\.15\d\.\d*\/\w{12}\/(\w*)\/(.*)/) {
print $x . "http://www.mediafire.com.SQUIDINTERNAL/" . $1 ."/" . $2 . "\n";

#fileserve
}elsif (m/^http:\/\/fs\w*\.fileserve\.com\/file\/(\w*)\/[\w-]*\.\/(.*)/) {
print $x . "http://www.fileserve.com.SQUIDINTERNAL/" . $1 . "./" . $2 . "\n";

#filesonic
}elsif (m/^http:\/\/s[0-9]*\.filesonic\.com\/download\/([0-9]*)\/(.*)/) {
print $x . "http://www.filesonic.com.SQUIDINTERNAL/" . $1 . "\n";

#4shared
}elsif (m/^http:\/\/[a-zA-Z]{2}\d*\.4shared\.com(:8080|)\/download\/(.*)\/(.*\..*)\?.*/) {
print $x . "http://www.4shared.com.SQUIDINTERNAL/download/$2\/$3\n";

#4shared preview
}elsif (m/^http:\/\/[a-zA-Z]{2}\d*\.4shared\.com(:8080|)\/img\/(\d*)\/\w*\/dlink__2Fdownload_2F(\w*)_3Ftsid_3D[\w-]*\/preview\.mp3\?sId=\w*/) {
print $x . "http://www.4shared.com.SQUIDINTERNAL/$2\n";

#photos-X.ak.fbcdn.net where X a-z
}elsif (m/^http:\/\/photos-[a-z](\.ak\.fbcdn\.net)(\/.*\/)(.*\.jpg)/) {
print $x . "http://photos" . $1 . "/" . $2 . $3  . "\n";

#YX.sphotos.ak.fbcdn.net where X 1-9, Y a-z
} elsif (m/^http:\/\/[a-z][0-9]\.sphotos\.ak\.fbcdn\.net\/(.*)\/(.*)/) {
print $x . "http://photos.ak.fbcdn.net/" . $1  ."/". $2 . "\n";

#maps.google.com
} elsif (m/^http:\/\/(cbk|mt|khm|mlt|tbn)[0-9]?(.google\.co(m|\.uk|\.id).*)/) {
print $x . "http://" . $1  . $2 . "\n";

# compatibility for old cached get_video?video_id
} elsif (m/^http:\/\/([0-9.]{4}|.*\.youtube\.com|.*\.googlevideo\.com|.*\.video\.google\.com).*?(videoplayback\?id=.*?|video_id=.*?)\&(.*?)/) {
$z = $2; $z =~ s/video_id=/get_video?video_id=/;
print $x . "http://video-srv.youtube.com.SQUIDINTERNAL/" . $z . "\n";

# youtube fix
} elsif (m/^http:\/\/([0-9.]{4}|.*\.youtube\.com|.*\.googlevideo\.com|.*\.video\.google\.com)\/videoplayback\?(.*)/) {
$p_str = $2;
$tag = "";
$alg = "";
$id = "";
$range = "";
if ($p_str =~ m/(itag=[0-9]*)/){$tag = "&".$1}
if ($p_str =~ m/(algorithm=[a-z]*\-[a-z]*)/){$alg = "&".$1}
if ($p_str =~ m/(id=[a-zA-Z0-9]*)/){$id = "&".$1}
if ($p_str =~ m/(range=[0-9\-]*)/){$range = "&".$1; $range =~ s/-//; $range =~ s/range=//; }
print $x . "http://video-srv.youtube.com.SQUIDINTERNAL/" . $tag . "&" . $alg . "&" . $id . "&" . $range . "\n";

} elsif (m/^http:\/\/www\.google-analytics\.com\/__utm\.gif\?.*/) {
print $x . "http://www.google-analytics.com/__utm.gif\n";

#Cache High Latency Ads
} elsif (m/^http:\/\/([a-z0-9.]*)(\.doubleclick\.net|\.quantserve\.com|\.googlesyndication\.com|yieldmanager|cpxinteractive)(.*)/) {
$y = $3;$z = $2;
for ($y) {
s/pixel;.*/pixel/;
s/activity;.*/activity/;
s/(imgad[^&]*).*/\1/;
s/;ord=[?0-9]*//;
s/;&timestamp=[0-9]*//;
s/[&?]correlator=[0-9]*//;
s/&cookie=[^&]*//;
s/&ga_hid=[^&]*//;
s/&ga_vid=[^&]*//;
s/&ga_sid=[^&]*//;
# s/&prev_slotnames=[^&]*//
# s/&u_his=[^&]*//;
s/&dt=[^&]*//;
s/&dtd=[^&]*//;
s/&lmt=[^&]*//;
s/(&alternate_ad_url=http%3A%2F%2F[^(%2F)]*)[^&]*/\1/;
s/(&url=http%3A%2F%2F[^(%2F)]*)[^&]*/\1/;
s/(&ref=http%3A%2F%2F[^(%2F)]*)[^&]*/\1/;
s/(&cookie=http%3A%2F%2F[^(%2F)]*)[^&]*/\1/;
s/[;&?]ord=[?0-9]*//;
s/[;&]mpvid=[^&;]*//;
s/&xpc=[^&]*//;
# yieldmanager
s/\?clickTag=[^&]*//;
s/&u=[^&]*//;
s/&slotname=[^&]*//;
s/&page_slots=[^&]*//;
}
print $x . "http://" . $1 . $2 . $y . "\n";

#cache high latency ads
} elsif (m/^http:\/\/(.*?)\/(ads)\?(.*?)/) {
print $x . "http://" . $1 . "/" . $2  . "\n";

# spicific servers starts here....
} elsif (m/^http:\/\/(www\.ziddu\.com.*\.[^\/]{3,4})\/(.*?)/) {
print $x . "http://" . $1 . "\n";

#cdn, varialble 1st path
} elsif (($u =~ /filehippo/) && (m/^http:\/\/(.*?)\.(.*?)\/(.*?)\/(.*)\.([a-z0-9]{3,4})(\?.*)?/)) {
@y = ($1,$2,$4,$5);
$y[0] =~ s/[a-z0-9]{2,5}/cdn./;
print $x . "http://" . $y[0] . $y[1] . "/" . $y[2] . "." . $y[3] . "\n";

#rapidshare
} elsif (($u =~ /rapidshare/) && (m/^http:\/\/(([A-Za-z]+[0-9-.]+)*?)([a-z]*\.[^\/]{3}\/[a-z]*\/[0-9]*)\/(.*?)\/([^\/\?\&]{4,})$/)) {
print $x . "http://cdn." . $3 . "/SQUIDINTERNAL/" . $5 . "\n";

} elsif (($u =~ /maxporn/) && (m/^http:\/\/([^\/]*?)\/(.*?)\/([^\/]*?)(\?.*)?$/)) {
print $x . "http://" . $1 . "/SQUIDINTERNAL/" . $3 . "\n";

#like porn hub variables url and center part of the path, filename etention 3 or 4 with or without ? at the end
} elsif (($u =~ /tube8|pornhub|xvideos/) && (m/^http:\/\/(([A-Za-z]+[0-9-.]+)*?(\.[a-z]*)?)\.([a-z]*[0-9]?\.[^\/]{3}\/[a-z]*)(.*?)((\/[a-z]*)?(\/[^\/]*){4}\.[^\/\?]{3,4})(\?.*)?$/)) {
print $x . "http://cdn." . $4 . $6 . "\n";
#...spicific servers end here.

#photos-X.ak.fbcdn.net where X a-z
} elsif (m/^http:\/\/photos-[a-z].ak.fbcdn.net\/(.*)/) {
print $x . "http://photos.ak.fbcdn.net/" . $1  . "\n";

#for yimg.com video
} elsif (m/^http:\/\/(.*yimg.com)\/\/(.*)\/([^\/\?\&]*\/[^\/\?\&]*\.[^\/\?\&]{3,4})(\?.*)?$/) {
print $x . "http://cdn.yimg.com//" . $3 . "\n";

#for yimg.com doubled
} elsif (m/^http:\/\/(.*?)\.yimg\.com\/(.*?)\.yimg\.com\/(.*?)\?(.*)/) {
print $x . "http://cdn.yimg.com/"  . $3 . "\n";

#for yimg.com with &sig=
} elsif (m/^http:\/\/(.*?)\.yimg\.com\/(.*)/) {
@y = ($1,$2);
$y[0] =~ s/[a-z]+[0-9]+/cdn/;
$y[1] =~ s/&sig=.*//;
print $x . "http://" . $y[0] . ".yimg.com/"  . $y[1] . "\n";

#youjizz. We use only domain and filename
} elsif (($u =~ /media[0-9]{2,5}\.youjizz/) && (m/^http:\/\/(.*)(\.[^\.\-]*?\..*?)\/(.*)\/([^\/\?\&]*)\.([^\/\?\&]{3,4})((\?|\%).*)?$/)) {
@y = ($1,$2,$4,$5);
$y[0] =~ s/(([a-zA-A]+[0-9]+(-[a-zA-Z])?$)|(.*cdn.*)|(.*cache.*))/cdn/;
print $x . "http://" . $y[0] . $y[1] . "/" . $y[2] . "." . $y[3] . "\n";

#general purpose for cdn servers. add above your specific servers.
} elsif (m/^http:\/\/([0-9.]*?)\/\/(.*?)\.(.*)\?(.*?)/) {
print $x . "http://squid-cdn-url//" . $2  . "." . $3 . "\n";

#generic http://variable.domain.com/path/filename."ex" "ext" or "exte" with or withour "? or %"
} elsif (m/^http:\/\/(.*)(\.[^\.\-]*?\..*?)\/(.*)\.([^\/\?\&]{2,4})((\?|\%).*)?$/) {
@y = ($1,$2,$3,$4);
$y[0] =~ s/(([a-zA-A]+[0-9]+(-[a-zA-Z])?$)|(.*cdn.*)|(.*cache.*))/cdn/;
print $x . "http://" . $y[0] . $y[1] . "/" . $y[2] . "." . $y[3] . "\n";

# generic http://variable.domain.com/...
} elsif (m/^http:\/\/(([A-Za-z]+[0-9-]+)*?|.*cdn.*|.*cache.*)\.(.*?)\.(.*?)\/(.*)$/) {
print $x . "http://cdn." . $3 . "." . $4 . "/" . $5 .  "\n";

# spicific extention that ends with ?
} elsif (m/^http:\/\/(.*?)\/(.*?)\.(jp(e?g|e|2)|gif|png|tiff?|bmp|ico|flv|on2)(.*)/) {
print $x . "http://" . $1 . "/" . $2  . "." . $3 . "\n";

# all that ends with ;
} elsif (m/^http:\/\/(.*?)\/(.*?)\;(.*)/) {
print $x . "http://" . $1 . "/" . $2  . "\n";

} else {
print $x . $_ . "sucks\n";
}
}

Save & Exit.

Now create cache dir and assign proper permission to proxy user

mkdir /cache1
chown proxy:proxy /cache1
chmod -R  777 /cache1

Now  initialize squid cache directories by

squid -z

You should see Following message

Creating Swap Directories

After this, start SQUID service by

service squid start

From Client end, point your browser to use your squid as proxy server and test any video.

TESTING YOUTUBE CACHING :-)

From client side, View any youtube video, after viewing, clear your cache and play it again (or from another pc) and you will notice that the video loads instantly,

Example:

You can monitor the CACHE TCP_HIT ENTRIES in squid logs, you can view them by

tail -f /var/log/squid/access.log | grep HIT

More information can be found at

http://aacable.wordpress.com/tag/aacable-howto-cache-youtube/

If you face “An error occured” in cached videos , see the following

http://aacable.wordpress.com/2012/01/30/youtube-caching-problem-an-error-occured-please-try-again-later-solved/

Regard’s

Syed Jahanzaib

Theme: Silver is the New Black. Blog at WordPress.com.

Follow

Get every new post delivered to your Inbox.

Join 250 other followers