Syed Jahanzaib Personnel Blog to Share Knowledge !

January 30, 2012

Youtube Caching Problem : An error occured. Please try again later. [SOLVED] updated storeurl.pl


YOUTUBE CACHING WITH SQUID -by- Syed Jahanzaib

YOUTUBE CACHING WITH SQUID -by- Syed Jahanzaib

 UPDATED: 10th MAY, 2012

====================================================================

BAD NEWS:

Some research updates: (Someone please confirm and post your comments)

YOUTUBE
has split its videos into segments of 1.7 Mb each which is around 51 seconds of video clip length.
I am sure YOUTUBE have taken this step to prevent people from caching  entire videos.
If you have a video which is 100 Mb large, it will be split into about 55-60 segments.
[I consider it as a benefit , it will prevent squid from filling quickly, because many users dont watch video entirely and jump to next one without finishing the video.]
As of right now, storeurl.pl is not be able to cache Youtube Video entirely, it is caching only initial 51 seconds of the clip.
Currently VIDEOCACHE plugin is doing full cache of youtube but at higher  400  $$$ cost yearly :| [Syed JahanZaib]

You can also try this updated version of youtube caching with NGINX.

http://aacable.wordpress.com/2012/08/13/youtube-caching-with-squid-nginx/

====================================================================

If you are caching youtube using storeurl.pl method
Example: http://aacable.wordpress.com/2012/01/19/youtube-caching-with-squid-2-7-using-storeurl-pl/ 

and you encounter following error while watching any cached video
An error occured, Please try again later
As showed in the image below . . .

Then try using the following updated storeurl.pl

#!/usr/bin/perl
# This script is NOT written or modified by me, I only copy pasted it from the internet.
# It was First originally Writen by chudy_fernandez@yahoo.com
# & Have been modified by various persons over the net to fix/add various functions.
# Like For Example modified by member of comstuff.net to satisfy common and dynamic content.
# th30nly @comstuff.net a.k.a invisible_theater , and possibly other people too.
# For more info, http://wiki.squid-cache.org/ConfigExamples/DynamicContent/YouTube
# Syed Jahanzaib / aacable @ hotmail . com

$|=1;
while (<>) {
@X = split;
#       $X[1] =~ s/&sig=.*//;
$x = $X[0] . " ";
$_ = $X[1];
$u = $X[1];

if ($X[1] =~ /(youtube|google).*videoplayback\?/){
@itag = m/[&?](itag=[0-9]*)/;
@id = m/[&?](id=[^\&]*)/;
@range = m/[&?](range=[^\&\s]*)/;
@begin = m/[&?](begin=[^\&\s]*)/;
@redirect = m/[&?](redirect_counter=[^\&]*)/;
print $x . "http://video-srv.youtube.com.SQUIDINTERNAL/@itag&@id&@range@begin@redirect\n";

#speedtest
}elsif (m/^http:\/\/(.*)\/speedtest\/(.*\.(jpg|txt))\?(.*)/) {
print $x . "http://www.speedtest.net.SQUIDINTERNAL/speedtest/" . $2 . "\n";

#mediafire
}elsif (m/^http:\/\/199\.91\.15\d\.\d*\/\w{12}\/(\w*)\/(.*)/) {
print $x . "http://www.mediafire.com.SQUIDINTERNAL/" . $1 ."/" . $2 . "\n";

#fileserve
}elsif (m/^http:\/\/fs\w*\.fileserve\.com\/file\/(\w*)\/[\w-]*\.\/(.*)/) {
print $x . "http://www.fileserve.com.SQUIDINTERNAL/" . $1 . "./" . $2 . "\n";

#filesonic
}elsif (m/^http:\/\/s[0-9]*\.filesonic\.com\/download\/([0-9]*)\/(.*)/) {
print $x . "http://www.filesonic.com.SQUIDINTERNAL/" . $1 . "\n";

#4shared
}elsif (m/^http:\/\/[a-zA-Z]{2}\d*\.4shared\.com(:8080|)\/download\/(.*)\/(.*\..*)\?.*/) {
print $x . "http://www.4shared.com.SQUIDINTERNAL/download/$2\/$3\n";

#4shared preview
}elsif (m/^http:\/\/[a-zA-Z]{2}\d*\.4shared\.com(:8080|)\/img\/(\d*)\/\w*\/dlink__2Fdownload_2F(\w*)_3Ftsid_3D[\w-]*\/preview\.mp3\?sId=\w*/) {
print $x . "http://www.4shared.com.SQUIDINTERNAL/$2\n";

#photos-X.ak.fbcdn.net where X a-z
}elsif (m/^http:\/\/photos-[a-z](\.ak\.fbcdn\.net)(\/.*\/)(.*\.jpg)/) {
print $x . "http://photos" . $1 . "/" . $2 . $3  . "\n";

#YX.sphotos.ak.fbcdn.net where X 1-9, Y a-z
} elsif (m/^http:\/\/[a-z][0-9]\.sphotos\.ak\.fbcdn\.net\/(.*)\/(.*)/) {
print $x . "http://photos.ak.fbcdn.net/" . $1  ."/". $2 . "\n";

#maps.google.com
} elsif (m/^http:\/\/(cbk|mt|khm|mlt|tbn)[0-9]?(.google\.co(m|\.uk|\.id).*)/) {
print $x . "http://" . $1  . $2 . "\n";

# compatibility for old cached get_video?video_id
} elsif (m/^http:\/\/([0-9.]{4}|.*\.youtube\.com|.*\.googlevideo\.com|.*\.video\.google\.com).*?(videoplayback\?id=.*?|video_id=.*?)\&(.*?)/) {
$z = $2; $z =~ s/video_id=/get_video?video_id=/;
print $x . "http://video-srv.youtube.com.SQUIDINTERNAL/" . $z . "\n";

} elsif (m/^http:\/\/www\.google-analytics\.com\/__utm\.gif\?.*/) {
print $x . "http://www.google-analytics.com/__utm.gif\n";

#Cache High Latency Ads
} elsif (m/^http:\/\/([a-z0-9.]*)(\.doubleclick\.net|\.quantserve\.com|\.googlesyndication\.com|yieldmanager|cpxinteractive)(.*)/) {
$y = $3;$z = $2;
for ($y) {
s/pixel;.*/pixel/;
s/activity;.*/activity/;
s/(imgad[^&]*).*/\1/;
s/;ord=[?0-9]*//;
s/;&timestamp=[0-9]*//;
s/[&?]correlator=[0-9]*//;
s/&cookie=[^&]*//;
s/&ga_hid=[^&]*//;
s/&ga_vid=[^&]*//;
s/&ga_sid=[^&]*//;
# s/&prev_slotnames=[^&]*//
# s/&u_his=[^&]*//;
s/&dt=[^&]*//;
s/&dtd=[^&]*//;
s/&lmt=[^&]*//;
s/(&alternate_ad_url=http%3A%2F%2F[^(%2F)]*)[^&]*/\1/;
s/(&url=http%3A%2F%2F[^(%2F)]*)[^&]*/\1/;
s/(&ref=http%3A%2F%2F[^(%2F)]*)[^&]*/\1/;
s/(&cookie=http%3A%2F%2F[^(%2F)]*)[^&]*/\1/;
s/[;&?]ord=[?0-9]*//;
s/[;&]mpvid=[^&;]*//;
s/&xpc=[^&]*//;
# yieldmanager
s/\?clickTag=[^&]*//;
s/&u=[^&]*//;
s/&slotname=[^&]*//;
s/&page_slots=[^&]*//;
}
print $x . "http://" . $1 . $2 . $y . "\n";

#cache high latency ads
} elsif (m/^http:\/\/(.*?)\/(ads)\?(.*?)/) {
print $x . "http://" . $1 . "/" . $2  . "\n";

# spicific servers starts here....
} elsif (m/^http:\/\/(www\.ziddu\.com.*\.[^\/]{3,4})\/(.*?)/) {
print $x . "http://" . $1 . "\n";

#cdn, varialble 1st path
} elsif (($u =~ /filehippo/) && (m/^http:\/\/(.*?)\.(.*?)\/(.*?)\/(.*)\.([a-z0-9]{3,4})(\?.*)?/)) {
@y = ($1,$2,$4,$5);
$y[0] =~ s/[a-z0-9]{2,5}/cdn./;
print $x . "http://" . $y[0] . $y[1] . "/" . $y[2] . "." . $y[3] . "\n";

#rapidshare
} elsif (($u =~ /rapidshare/) && (m/^http:\/\/(([A-Za-z]+[0-9-.]+)*?)([a-z]*\.[^\/]{3}\/[a-z]*\/[0-9]*)\/(.*?)\/([^\/\?\&]{4,})$/)) {
print $x . "http://cdn." . $3 . "/SQUIDINTERNAL/" . $5 . "\n";

} elsif (($u =~ /maxporn/) && (m/^http:\/\/([^\/]*?)\/(.*?)\/([^\/]*?)(\?.*)?$/)) {
print $x . "http://" . $1 . "/SQUIDINTERNAL/" . $3 . "\n";

#like porn hub variables url and center part of the path, filename etention 3 or 4 with or without ? at the end
} elsif (($u =~ /tube8|pornhub|xvideos/) && (m/^http:\/\/(([A-Za-z]+[0-9-.]+)*?(\.[a-z]*)?)\.([a-z]*[0-9]?\.[^\/]{3}\/[a-z]*)(.*?)((\/[a-z]*)?(\/[^\/]*){4}\.[^\/\?]{3,4})(\?.*)?$/)) {
print $x . "http://cdn." . $4 . $6 . "\n";
#...spicific servers end here.

#photos-X.ak.fbcdn.net where X a-z
} elsif (m/^http:\/\/photos-[a-z].ak.fbcdn.net\/(.*)/) {
print $x . "http://photos.ak.fbcdn.net/" . $1  . "\n";

#for yimg.com video
} elsif (m/^http:\/\/(.*yimg.com)\/\/(.*)\/([^\/\?\&]*\/[^\/\?\&]*\.[^\/\?\&]{3,4})(\?.*)?$/) {
print $x . "http://cdn.yimg.com//" . $3 . "\n";

#for yimg.com doubled
} elsif (m/^http:\/\/(.*?)\.yimg\.com\/(.*?)\.yimg\.com\/(.*?)\?(.*)/) {
print $x . "http://cdn.yimg.com/"  . $3 . "\n";

#for yimg.com with &sig=
} elsif (m/^http:\/\/(.*?)\.yimg\.com\/(.*)/) {
@y = ($1,$2);
$y[0] =~ s/[a-z]+[0-9]+/cdn/;
$y[1] =~ s/&sig=.*//;
print $x . "http://" . $y[0] . ".yimg.com/"  . $y[1] . "\n";

#youjizz. We use only domain and filename
} elsif (($u =~ /media[0-9]{2,5}\.youjizz/) && (m/^http:\/\/(.*)(\.[^\.\-]*?\..*?)\/(.*)\/([^\/\?\&]*)\.([^\/\?\&]{3,4})((\?|\%).*)?$/)) {
@y = ($1,$2,$4,$5);
$y[0] =~ s/(([a-zA-A]+[0-9]+(-[a-zA-Z])?$)|(.*cdn.*)|(.*cache.*))/cdn/;
print $x . "http://" . $y[0] . $y[1] . "/" . $y[2] . "." . $y[3] . "\n";

#general purpose for cdn servers. add above your specific servers.
} elsif (m/^http:\/\/([0-9.]*?)\/\/(.*?)\.(.*)\?(.*?)/) {
print $x . "http://squid-cdn-url//" . $2  . "." . $3 . "\n";

#generic http://variable.domain.com/path/filename."ex" "ext" or "exte" with or withour "? or %"
} elsif (m/^http:\/\/(.*)(\.[^\.\-]*?\..*?)\/(.*)\.([^\/\?\&]{2,4})((\?|\%).*)?$/) {
@y = ($1,$2,$3,$4);
$y[0] =~ s/(([a-zA-A]+[0-9]+(-[a-zA-Z])?$)|(.*cdn.*)|(.*cache.*))/cdn/;
print $x . "http://" . $y[0] . $y[1] . "/" . $y[2] . "." . $y[3] . "\n";

# generic http://variable.domain.com/...
} elsif (m/^http:\/\/(([A-Za-z]+[0-9-]+)*?|.*cdn.*|.*cache.*)\.(.*?)\.(.*?)\/(.*)$/) {
print $x . "http://cdn." . $3 . "." . $4 . "/" . $5 .  "\n";

# spicific extention that ends with ?
} elsif (m/^http:\/\/(.*?)\/(.*?)\.(jp(e?g|e|2)|gif|png|tiff?|bmp|ico|flv|wmv|3gp|mp(4|3)|exe|msi|zip|on2|mar|rar|cab|amf|swf)(.*)/) {
print $x . "http://" . $1 . "/" . $2  . "." . $3 . "\n";

# all that ends with ;
} elsif (m/^http:\/\/(.*?)\/(.*?)\;(.*)/) {
print $x . "http://" . $1 . "/" . $2  . "\n";

} else {
print $x . $_ . "sucks\n";
}
}

I have deployed this storeurl.pl at various different networks, and in my LAB, and so far result is good , although sometimes user still face errors in few videos, but the ratio is very less.
Also try this update:
storeurl_rewrite_children 15
storeurl_rewrite_concurrency 100

Please post your comments regarding updated storeurl.pl

.
.
.

Regard’s
SYED JAHANZAIB

January 19, 2012

Youtube caching with SQUID 2.7 [using storeurl.pl]

Filed under: Linux Related — Tags: , — Syed Jahanzaib / Pinochio~:) @ 11:34 AM

UPDATED: 10th MAY, 2012

BAD NEWS:

Some research update (Someone Please confirm & post your comments)

YOUTUBE has split its videos into segments of 1.7 Mb each which is the approximation of the 51 seconds. I am sure YOUTUBE have taken this step to prevent people from caching  entire videos, [In my opinion its good to prevent squid from filling up squickly]. If you are watching a a video which is 100 Mb large, it will be split into about 55-60 segments.
As of right now, storeurl.pl wont be able to cache entire video, only first 51 seconds will be cached.
As of right now VIDEOCACHE plugin is doing full cache of youtube videos but at higher 400 $ Per Year price :)

UPDATED: OCTOBER, 2012

Following method is working to cache youtube. its working almost 100%. Read more at

http://aacable.wordpress.com/2012/08/13/youtube-caching-with-squid-nginx/

UPDATED: 2nd April, 2012

This is a Shorten Version of  http://aacable.wordpress.com/tag/aacable-howto-cache-youtube/

This is a quick reference guide for SQUID 2.7 installation on Ubuntu Desktop ver 10.4 with youtube caching supported. Make sure you have setup proper internet connection in Ubuntu BOX.

Install SQUID by

apt-get install squid

After installation done, Now edit it’s configuration file

nano /etc/squid/squid.conf

remove current lines and paste all squid.conf as follows.

# SQUID 2.7 CONFIG FILE
# By - Syed Jahanzaib</pre>
# Email: aacable@hotmail.com
# Web  : http://aacable.wordpress.com

# PORT and Transparent Option
http_port 8080 transparent
server_http11 on
icp_port 0

# Cache Directory , modify it according to your system.
# but first create directory in root by mkdir /cache1
# and then issue this command  chown proxy:proxy /cache1
# [for ubuntu user is proxy, in Fedora user is SQUID]
# I have set 200 GB for caching
# adjust it according to your need.
# My recommendation is to have one cache_dir per drive. zzz

store_dir_select_algorithm round-robin
cache_dir aufs /cache1 200000 16 256
#cache_dir ufs /mnt/hdd2/cache2 200000 16 256 # If you have secondary HDD
memory_replacement_policy heap GDSF
cache_replacement_policy heap GDSF

# If you want to enable DATE time n SQUID Logs,use following
emulate_httpd_log on
logformat squid %tl %6tr %>a %Ss/%03Hs %<st %rm %ru %un %Sh/%<A %mt
log_fqdn off

# How much days to keep users access web logs
# You need to rotate your log files with a cron job. For example:
# 0 0 * * * /usr/local/squid/bin/squid -k rotate
logfile_rotate 14
debug_options ALL,1
cache_access_log /var/log/squid/access.log
cache_log none
cache_store_log none

#acl adsites dstdomain url_regex "/etc/squid/adslist.txt"
#http_access deny adsites
#deny_info http://192.168.6.1/psb.htm adsites

#I used DNSAMSQ service for fast dns resolving
#so install by using "apt-get install dnsmasq" first
dns_nameservers 127.0.0.1 221.132.112.8
ftp_user anonymous@
ftp_list_width 32
ftp_passive on
ftp_sanitycheck on

#ACL Section mylan myacl
acl all src 0.0.0.0/0.0.0.0
#acl all src 192.168.50.0/255.255.255.0
#acl all2 src 10.0.0.0/255.0.0.0
acl manager proto cache_object
acl localhost src 127.0.0.1/255.255.255.255
acl to_localhost dst 127.0.0.0/8
acl SSL_ports port 443 563 # https, snews
acl SSL_ports port 873 # rsync
acl Safe_ports port 80 # http
acl Safe_ports port 21 # ftp
acl Safe_ports port 443 563 # https, snews
acl Safe_ports port 70 # gopher
acl Safe_ports port 210 # wais
acl Safe_ports port 1025-65535 # unregistered ports
acl Safe_ports port 280 # http-mgmt
acl Safe_ports port 488 # gss-http
acl Safe_ports port 591 # filemaker
acl Safe_ports port 777 # multiling http
acl Safe_ports port 631 # cups
acl Safe_ports port 873 # rsync
acl Safe_ports port 901 # SWAT
acl purge method PURGE
acl CONNECT method CONNECT
http_access allow manager all
http_access deny manager
http_access allow purge localhost
http_access deny purge
http_access deny !Safe_ports
http_access deny CONNECT !SSL_ports
http_access allow localhost
http_access allow all
#http_access allow all2
http_reply_access allow all
#http_reply_access allow all2
icp_access allow all

#==========================
# Administrative Parameters
#==========================

#============================================================$
# SNMP , if you want to generate graphs for SQUID via MRTG
#============================================================$
#acl snmppublic snmp_community public
#snmp_port 3401
#snmp_access allow snmppublic all
#snmp_access allow all

# I used UBUNTU so user is proxy, in FEDORA you may use use squid
cache_effective_user proxy
cache_effective_group proxy
cache_mgr SAJID
visible_hostname aacable.wordpress.com
unique_hostname aacable@hotmail.com

# Memory
cache_mem 8 MB
minimum_object_size 0 bytes
maximum_object_size 700 MB
maximum_object_size_in_memory 32 KB

tcp_outgoing_tos 0x30 all
zph_mode tos
zph_local 0x30
zph_parent 0
zph_option 136

acl store_rewrite_list urlpath_regex            \/(get_video|videoplayback\?id|videoplayback.*id)
acl store_rewrite_list urlpath_regex            \.(jp(e?g|e|2)|gif|png|tiff?|bmp|ico|flv|wmv|3gp|mp(4|3)|exe|msi|zip|on2|mar)\?
acl store_rewrite_list_domain url_regex         ^http:\/\/([a-zA-Z-]+[0-9-]+)\.[A-Za-z]*\.[A-Za-z]*
acl store_rewrite_list_domain url_regex         (([a-z]{1,2}[0-9]{1,3})|([0-9]{1,3}[a-z]{1,2}))\.[a-z]*[0-9]?\.[a-z]{3}
acl store_rewrite_list_path urlpath_regex       \.(jp(e?g|e|2)|gif|png|tiff?|bmp|ico|flv|avc|zip|mp3|3gp|rar|on2|mar|exe)$
acl store_rewrite_list_domain_CDN url_regex     \.rapidshare\.com.*\/[0-9]*\/.*\/[^\/]* ^http:\/\/(www\.ziddu\.com.*\.[^\/]{3,4})\/(.*) \.doubleclick\.net.*
acl store_rewrite_list_domain_CDN url_regex     ^http:\/\/[.a-z0-9]*\.photobucket\.com.*\.[a-z]{3}$ quantserve\.com
acl store_rewrite_list_domain_CDN url_regex     ^http:\/\/[a-z]+[0-9]\.google\.co(m|\.id)
acl store_rewrite_list_domain_CDN url_regex     ^http:\/\/\.www[0-9][0-9]\.indowebster\.com\/(.*)(rar|zip|flv|wm(a|v)|3gp|mp(4|3)|exe|msi|avi|(mp(e?g|a|e|1|2|3|4))|cab|exe)
acl dontrewrite url_regex redbot\.org \.php
acl getmethod method GET

storeurl_access deny dontrewrite
storeurl_access deny !getmethod
storeurl_access allow store_rewrite_list_domain_CDN
storeurl_access allow store_rewrite_list
storeurl_access allow store_rewrite_list_domain
storeurl_access allow store_rewrite_list_path
storeurl_access deny all

storeurl_rewrite_program /etc/squid/storeurl.pl
storeurl_rewrite_children 7
storeurl_rewrite_concurrency 0

##
refresh_pattern -i \.htm 120 50% 10080 reload-into-ims
refresh_pattern -i \.html 120 50% 10080 reload-into-ims
refresh_pattern ^http://*.facebook.com/* 720 100% 4320
refresh_pattern ^http://mail.yahoo.com/.* 720 100% 4320
refresh_pattern ^http://*.yahoo.*/.* 720 100% 4320
refresh_pattern ^http://*.yimg.*/.* 720 100% 4320
refresh_pattern ^http://*.gmail.*/.* 720 100% 4320
refresh_pattern ^http://*.google.*/.* 720 100% 4320
refresh_pattern ^http://*.kaskus.*/.* 720 100% 4320
refresh_pattern ^http://*.googlesyndication.*/.* 720 100% 4320
refresh_pattern ^http://*.plasa.*/.* 720 100% 4320
refresh_pattern ^http://*.telkom.*/.* 720 100% 4320
##

# 1 year = 525600 mins, 1 month = 43800 mins
refresh_pattern imeem.*\.flv  0 0% 0     override-lastmod override-expire
refresh_pattern \.rapidshare.*\/[0-9]*\/.*\/[^\/]*   161280    90%    161280 ignore-reload

refresh_pattern (get_video\?|videoplayback\?|videodownload\?|\.flv?)    10800 80% 10800 ignore-no-cache  ignore-private override-expire override-lastmod reload-into-ims
refresh_pattern (get_video\?|videoplayback\?id|videoplayback.*id|videodownload\?|\.flv?)    10800 80% 10800 ignore-no-cache  ignore-private override-expire override-lastmod reload-into-ims
#refresh_pattern -i (get_video\?|videoplayback\?id|videoplayback.*id||videodownload\?|\.flv?)       10800 80% 10800 ignore-no-cache  ignore-private override-expire override-lastmod reload-into-ims
refresh_pattern \.(ico|video-stats) 10800 80% 10800    override-expire ignore-reload ignore-no-cache  ignore-private ignore-auth override-lastmod  negative-ttl=10080
refresh_pattern \.etology\?                       10800 80% 10800    override-expire ignore-reload ignore-no-cache
refresh_pattern galleries\.video(\?|sz)               10800 80% 10800    override-expire ignore-reload ignore-no-cache
refresh_pattern brazzers\?                       10800 80% 10800    override-expire ignore-reload ignore-no-cache
refresh_pattern \.adtology\?                      10800 80% 10800    override-expire ignore-reload ignore-no-cache
refresh_pattern ^.*(utm\.gif|ads\?|rmxads\.com|ad\.z5x\.net|bh\.contextweb\.com|bstats\.adbrite\.com|a1\.interclick\.com|ad\.trafficmp\.com|ads\.cubics\.com|ad\.xtendmedia\.com|\.googlesyndication\.com|advertising\.com|yieldmanager|game-advertising\.com|pixel\.quantserve\.com|adperium\.com|doubleclick\.net|adserving\.cpxinteractive\.com|syndication\.com|media.fastclick.net).* 10800 20% 10800 ignore-no-cache  ignore-private override-expire ignore-reload ignore-auth   negative-ttl=40320 max-stale=10
refresh_pattern ^.*safebrowsing.*google  10800 80% 10800 override-expire ignore-reload ignore-no-cache ignore-private ignore-auth  negative-ttl=10080
refresh_pattern ^http://((cbk|mt|khm|mlt)[0-9]?)\.google\.co(m|\.uk) 10800 80% 10800 override-expire ignore-reload   ignore-private  negative-ttl=10080
refresh_pattern ytimg\.com.*\.jpg                   10800 80% 10800    override-expire ignore-reload
refresh_pattern images\.friendster\.com.*\.(png|gif)           10800 80% 10800    override-expire ignore-reload
refresh_pattern garena\.com                                   10800 80% 10800     override-expire reload-into-ims
refresh_pattern photobucket.*\.(jp(e?g|e|2)|tiff?|bmp|gif|png)  10800 80% 10800     override-expire ignore-reload
refresh_pattern vid\.akm\.dailymotion\.com.*\.on2\?           10800 80% 10800 ignore-no-cache override-expire override-lastmod
refresh_pattern mediafire.com\/images.*\.(jp(e?g|e|2)|tiff?|bmp|gif|png)    10800 80% 10800 reload-into-ims override-expire ignore-private
refresh_pattern ^http:\/\/images|pics|thumbs[0-9]\.      10800 80% 10800 reload-into-ims ignore-no-cache  ignore-reload override-expire
refresh_pattern ^http:\/\/www.onemanga.com.*\/           10800 80% 10800 reload-into-ims ignore-no-cache  ignore-reload override-expire

# ANTI VIRUS
refresh_pattern guru.avg.com/.*\.(bin)                      10800 80% 10800 ignore-no-cache  ignore-reload  reload-into-ims
refresh_pattern (avgate|avira).*(idx|gz)$                           10800 80% 10800 ignore-no-cache  ignore-reload  reload-into-ims
refresh_pattern kaspersky.*\.avc$                                   10800 80% 10800 ignore-no-cache  ignore-reload  reload-into-ims
refresh_pattern kaspersky                                           10800 80% 10800 ignore-no-cache  ignore-reload  reload-into-ims
refresh_pattern update.nai.com/.*\.(gem|zip|mcs)                    10800 80% 10800 ignore-no-cache  ignore-reload  reload-into-ims
refresh_pattern ^http:\/\/liveupdate.symantecliveupdate.com.*\(zip)     10800 80% 10800 ignore-no-cache  ignore-reload  reload-into-ims

refresh_pattern windowsupdate.com/.*\.(cab|exe)             10800  80%  10800 ignore-no-cache  ignore-reload  reload-into-ims
refresh_pattern update.microsoft.com/.*\.(cab|exe)             10800  80%  10800 ignore-no-cache  ignore-reload  reload-into-ims
refresh_pattern download.microsoft.com/.*\.(cab|exe)             10800  80%  10800 ignore-no-cache  ignore-reload  reload-into-ims

#images facebook
refresh_pattern ((facebook.com)|(85.131.151.39)).*\.(jpg|png|gif)      10800 80% 10800 ignore-reload  override-expire ignore-no-cache
refresh_pattern -i \.fbcdn.net.*\.(jpg|gif|png|swf|mp3)                  10800 80% 10800 ignore-reload  override-expire ignore-no-cache
refresh_pattern  static\.ak\.fbcdn\.net*\.(jpg|gif|png)                  10800 80% 10800 ignore-reload  override-expire ignore-no-cache
refresh_pattern ^http:\/\/profile\.ak\.fbcdn.net*\.(jpg|gif|png)      10800 80% 10800 ignore-reload  override-expire ignore-no-cache

#banner IIX
refresh_pattern ^http:\/\/openx.*\.(jp(e?g|e|2)|gif|pn[pg]|swf|ico|css|tiff?) 10800 99999% 10800 reload-into-ims  ignore-reload override-expire ignore-no-cache
refresh_pattern ^http:\/\/ads(1|2|3).kompas.com.*\/           10800 99999% 10800 reload-into-ims  ignore-reload override-expire ignore-no-cache
refresh_pattern ^http:\/\/img.ads.kompas.com.*\/           10800 99999% 10800 reload-into-ims  ignore-reload override-expire ignore-no-cache
refresh_pattern .kompasimages.com.*\.(jpg|gif|png|swf)       10800 99999% 10800 reload-into-ims  ignore-reload override-expire ignore-no-cache
refresh_pattern ^http:\/\/openx.kompas.com.*\/           10800 99999% 10800 reload-into-ims  ignore-reload override-expire ignore-no-cache
refresh_pattern kaskus.\us.*\.(jp(e?g|e|2)|gif|png|swf)        10800 99999% 10800 reload-into-ims  ignore-reload override-expire ignore-no-cache
refresh_pattern ^http:\/\/img.kaskus.us.*\.(jpg|gif|png|swf)       10800 99999% 10800 reload-into-ims  ignore-reload override-expire ignore-no-cache

#IIX DOWNLOAD
refresh_pattern ^http:\/\/\.www[0-9][0-9]\.indowebster\.com\/(.*)(mp3|rar|zip|flv|wmv|3gp|mp(4|3)|exe|msi|zip) 10800 99999% 10800 reload-into-ims  ignore-reload override-expire ignore-no-cache    ignore-auth

#All File
refresh_pattern -i \.(3gp|7z|ace|asx|avi|bin|cab|dat|deb|divx|dvr-ms)      10800 80% 10800 ignore-no-cache  ignore-private override-expire override-lastmod reload-into-ims
refresh_pattern -i \.(rar|jar|gz|tgz|bz2|iso|m1v|m2(v|p)|mo(d|v))          10800 80% 10800 ignore-no-cache  ignore-private override-expire override-lastmod reload-into-ims
refresh_pattern -i \.(jp(e?g|e|2)|gif|pn[pg]|bm?|tiff?|ico|swf|css|js)     10800 80% 10800 ignore-no-cache  ignore-private override-expire override-lastmod reload-into-ims
refresh_pattern -i \.(mp(e?g|a|e|1|2|3|4)|mk(a|v)|ms(i|u|p)|og(x|v|a|g)|rar|rm|r(a|p)m|snd|vob|wav) 10800 80% 10800 ignore-no-cache ignore-private override-expire override-lastmod reload-into-ims
refresh_pattern -i \.(pp(s|t)|wax|wm(a|v)|wmx|wpl|zip|cb(r|z|t))     10800 80% 10800 ignore-no-cache ignore-private override-expire override-lastmod reload-into-ims

refresh_pattern (cgi-bin|\?)       0      0%      0
refresh_pattern ^gopher:    1440    0%    1440
refresh_pattern ^ftp:         10080     95%     10800 override-lastmod reload-into-ims
refresh_pattern         .     180     95% 10800 override-lastmod reload-into-ims

global_internal_static off
max_stale 10 years
retry_on_error on
buffered_logs on
read_ahead_gap 32 KB

header_access Accept-Encoding deny  all
client_persistent_connections off
server_persistent_connections on
half_closed_clients off
strip_query_terms off
quick_abort_min 0 KB
quick_abort_max 0 KB
quick_abort_pct 100
vary_ignore_expire on
reload_into_ims on
pipeline_prefetch on
read_timeout 30 minutes
client_lifetime 6 hours
negative_ttl 30 seconds
positive_dns_ttl 6 hours
negative_dns_ttl 60 seconds
pconn_timeout 15 seconds
request_timeout 1 minute
store_avg_object_size 13 KB
log_icp_queries off
ipcache_size 16384
ipcache_low 98
ipcache_high 99
log_fqdn off
fqdncache_size 16384
memory_pools off
forwarded_for on
client_db on
max_filedescriptors 8192

Save& Exit.

STOREURL.PL

Now create storeurl.pl which will be used to pull youtube video from cache.

touch /etc/squid/storeurl.pl
chmod +x /etc/squid/storeurl.pl

Now edit this file and paste the following contents.

nano /etc/squid/storeurl.pl

#!/usr/bin/perl
# This script is NOT written or modified by me, I only copy pasted it from the internet.
# It was First originally Writen by chudy_fernandez@yahoo.com
# & Have been modified by various persons over the net to fix/add various functions.
# For Example this ver was modified by member of comstuff.net to satisfy common and dynamic content.
# th30nly @comstuff.net a.k.a invisible_theater ,
# For more info, http://wiki.squid-cache.org/ConfigExamples/DynamicContent/YouTube
$|=1;
while (<>) {
@X = split;
#       $X[1] =~ s/&sig=.*//;
$x = $X[0] . " ";
$_ = $X[1];
$u = $X[1];

#speedtest
if (m/^http:\/\/(.*)\/speedtest\/(.*\.(jpg|txt))\?(.*)/) {
print $x . "http://www.speedtest.net.SQUIDINTERNAL/speedtest/" . $2 . "\n";

#mediafire
}elsif (m/^http:\/\/199\.91\.15\d\.\d*\/\w{12}\/(\w*)\/(.*)/) {
print $x . "http://www.mediafire.com.SQUIDINTERNAL/" . $1 ."/" . $2 . "\n";

#fileserve
}elsif (m/^http:\/\/fs\w*\.fileserve\.com\/file\/(\w*)\/[\w-]*\.\/(.*)/) {
print $x . "http://www.fileserve.com.SQUIDINTERNAL/" . $1 . "./" . $2 . "\n";

#filesonic
}elsif (m/^http:\/\/s[0-9]*\.filesonic\.com\/download\/([0-9]*)\/(.*)/) {
print $x . "http://www.filesonic.com.SQUIDINTERNAL/" . $1 . "\n";

#4shared
}elsif (m/^http:\/\/[a-zA-Z]{2}\d*\.4shared\.com(:8080|)\/download\/(.*)\/(.*\..*)\?.*/) {
print $x . "http://www.4shared.com.SQUIDINTERNAL/download/$2\/$3\n";

#4shared preview
}elsif (m/^http:\/\/[a-zA-Z]{2}\d*\.4shared\.com(:8080|)\/img\/(\d*)\/\w*\/dlink__2Fdownload_2F(\w*)_3Ftsid_3D[\w-]*\/preview\.mp3\?sId=\w*/) {
print $x . "http://www.4shared.com.SQUIDINTERNAL/$2\n";

#photos-X.ak.fbcdn.net where X a-z
}elsif (m/^http:\/\/photos-[a-z](\.ak\.fbcdn\.net)(\/.*\/)(.*\.jpg)/) {
print $x . "http://photos" . $1 . "/" . $2 . $3  . "\n";

#YX.sphotos.ak.fbcdn.net where X 1-9, Y a-z
} elsif (m/^http:\/\/[a-z][0-9]\.sphotos\.ak\.fbcdn\.net\/(.*)\/(.*)/) {
print $x . "http://photos.ak.fbcdn.net/" . $1  ."/". $2 . "\n";

#maps.google.com
} elsif (m/^http:\/\/(cbk|mt|khm|mlt|tbn)[0-9]?(.google\.co(m|\.uk|\.id).*)/) {
print $x . "http://" . $1  . $2 . "\n";

# compatibility for old cached get_video?video_id
} elsif (m/^http:\/\/([0-9.]{4}|.*\.youtube\.com|.*\.googlevideo\.com|.*\.video\.google\.com).*?(videoplayback\?id=.*?|video_id=.*?)\&(.*?)/) {
$z = $2; $z =~ s/video_id=/get_video?video_id=/;
print $x . "http://video-srv.youtube.com.SQUIDINTERNAL/" . $z . "\n";

# youtube fix
} elsif (m/^http:\/\/([0-9.]{4}|.*\.youtube\.com|.*\.googlevideo\.com|.*\.video\.google\.com)\/videoplayback\?(.*)/) {
$p_str = $2;
$tag = "";
$alg = "";
$id = "";
$range = "";
if ($p_str =~ m/(itag=[0-9]*)/){$tag = "&".$1}
if ($p_str =~ m/(algorithm=[a-z]*\-[a-z]*)/){$alg = "&".$1}
if ($p_str =~ m/(id=[a-zA-Z0-9]*)/){$id = "&".$1}
if ($p_str =~ m/(range=[0-9\-]*)/){$range = "&".$1; $range =~ s/-//; $range =~ s/range=//; }
print $x . "http://video-srv.youtube.com.SQUIDINTERNAL/" . $tag . "&" . $alg . "&" . $id . "&" . $range . "\n";

} elsif (m/^http:\/\/www\.google-analytics\.com\/__utm\.gif\?.*/) {
print $x . "http://www.google-analytics.com/__utm.gif\n";

#Cache High Latency Ads
} elsif (m/^http:\/\/([a-z0-9.]*)(\.doubleclick\.net|\.quantserve\.com|\.googlesyndication\.com|yieldmanager|cpxinteractive)(.*)/) {
$y = $3;$z = $2;
for ($y) {
s/pixel;.*/pixel/;
s/activity;.*/activity/;
s/(imgad[^&]*).*/\1/;
s/;ord=[?0-9]*//;
s/;&timestamp=[0-9]*//;
s/[&?]correlator=[0-9]*//;
s/&cookie=[^&]*//;
s/&ga_hid=[^&]*//;
s/&ga_vid=[^&]*//;
s/&ga_sid=[^&]*//;
# s/&prev_slotnames=[^&]*//
# s/&u_his=[^&]*//;
s/&dt=[^&]*//;
s/&dtd=[^&]*//;
s/&lmt=[^&]*//;
s/(&alternate_ad_url=http%3A%2F%2F[^(%2F)]*)[^&]*/\1/;
s/(&url=http%3A%2F%2F[^(%2F)]*)[^&]*/\1/;
s/(&ref=http%3A%2F%2F[^(%2F)]*)[^&]*/\1/;
s/(&cookie=http%3A%2F%2F[^(%2F)]*)[^&]*/\1/;
s/[;&?]ord=[?0-9]*//;
s/[;&]mpvid=[^&;]*//;
s/&xpc=[^&]*//;
# yieldmanager
s/\?clickTag=[^&]*//;
s/&u=[^&]*//;
s/&slotname=[^&]*//;
s/&page_slots=[^&]*//;
}
print $x . "http://" . $1 . $2 . $y . "\n";

#cache high latency ads
} elsif (m/^http:\/\/(.*?)\/(ads)\?(.*?)/) {
print $x . "http://" . $1 . "/" . $2  . "\n";

# spicific servers starts here....
} elsif (m/^http:\/\/(www\.ziddu\.com.*\.[^\/]{3,4})\/(.*?)/) {
print $x . "http://" . $1 . "\n";

#cdn, varialble 1st path
} elsif (($u =~ /filehippo/) && (m/^http:\/\/(.*?)\.(.*?)\/(.*?)\/(.*)\.([a-z0-9]{3,4})(\?.*)?/)) {
@y = ($1,$2,$4,$5);
$y[0] =~ s/[a-z0-9]{2,5}/cdn./;
print $x . "http://" . $y[0] . $y[1] . "/" . $y[2] . "." . $y[3] . "\n";

#rapidshare
} elsif (($u =~ /rapidshare/) && (m/^http:\/\/(([A-Za-z]+[0-9-.]+)*?)([a-z]*\.[^\/]{3}\/[a-z]*\/[0-9]*)\/(.*?)\/([^\/\?\&]{4,})$/)) {
print $x . "http://cdn." . $3 . "/SQUIDINTERNAL/" . $5 . "\n";

} elsif (($u =~ /maxporn/) && (m/^http:\/\/([^\/]*?)\/(.*?)\/([^\/]*?)(\?.*)?$/)) {
print $x . "http://" . $1 . "/SQUIDINTERNAL/" . $3 . "\n";

#like porn hub variables url and center part of the path, filename etention 3 or 4 with or without ? at the end
} elsif (($u =~ /tube8|pornhub|xvideos/) && (m/^http:\/\/(([A-Za-z]+[0-9-.]+)*?(\.[a-z]*)?)\.([a-z]*[0-9]?\.[^\/]{3}\/[a-z]*)(.*?)((\/[a-z]*)?(\/[^\/]*){4}\.[^\/\?]{3,4})(\?.*)?$/)) {
print $x . "http://cdn." . $4 . $6 . "\n";
#...spicific servers end here.

#photos-X.ak.fbcdn.net where X a-z
} elsif (m/^http:\/\/photos-[a-z].ak.fbcdn.net\/(.*)/) {
print $x . "http://photos.ak.fbcdn.net/" . $1  . "\n";

#for yimg.com video
} elsif (m/^http:\/\/(.*yimg.com)\/\/(.*)\/([^\/\?\&]*\/[^\/\?\&]*\.[^\/\?\&]{3,4})(\?.*)?$/) {
print $x . "http://cdn.yimg.com//" . $3 . "\n";

#for yimg.com doubled
} elsif (m/^http:\/\/(.*?)\.yimg\.com\/(.*?)\.yimg\.com\/(.*?)\?(.*)/) {
print $x . "http://cdn.yimg.com/"  . $3 . "\n";

#for yimg.com with &sig=
} elsif (m/^http:\/\/(.*?)\.yimg\.com\/(.*)/) {
@y = ($1,$2);
$y[0] =~ s/[a-z]+[0-9]+/cdn/;
$y[1] =~ s/&sig=.*//;
print $x . "http://" . $y[0] . ".yimg.com/"  . $y[1] . "\n";

#youjizz. We use only domain and filename
} elsif (($u =~ /media[0-9]{2,5}\.youjizz/) && (m/^http:\/\/(.*)(\.[^\.\-]*?\..*?)\/(.*)\/([^\/\?\&]*)\.([^\/\?\&]{3,4})((\?|\%).*)?$/)) {
@y = ($1,$2,$4,$5);
$y[0] =~ s/(([a-zA-A]+[0-9]+(-[a-zA-Z])?$)|(.*cdn.*)|(.*cache.*))/cdn/;
print $x . "http://" . $y[0] . $y[1] . "/" . $y[2] . "." . $y[3] . "\n";

#general purpose for cdn servers. add above your specific servers.
} elsif (m/^http:\/\/([0-9.]*?)\/\/(.*?)\.(.*)\?(.*?)/) {
print $x . "http://squid-cdn-url//" . $2  . "." . $3 . "\n";

#generic http://variable.domain.com/path/filename."ex" "ext" or "exte" with or withour "? or %"
} elsif (m/^http:\/\/(.*)(\.[^\.\-]*?\..*?)\/(.*)\.([^\/\?\&]{2,4})((\?|\%).*)?$/) {
@y = ($1,$2,$3,$4);
$y[0] =~ s/(([a-zA-A]+[0-9]+(-[a-zA-Z])?$)|(.*cdn.*)|(.*cache.*))/cdn/;
print $x . "http://" . $y[0] . $y[1] . "/" . $y[2] . "." . $y[3] . "\n";

# generic http://variable.domain.com/...
} elsif (m/^http:\/\/(([A-Za-z]+[0-9-]+)*?|.*cdn.*|.*cache.*)\.(.*?)\.(.*?)\/(.*)$/) {
print $x . "http://cdn." . $3 . "." . $4 . "/" . $5 .  "\n";

# spicific extention that ends with ?
} elsif (m/^http:\/\/(.*?)\/(.*?)\.(jp(e?g|e|2)|gif|png|tiff?|bmp|ico|flv|on2)(.*)/) {
print $x . "http://" . $1 . "/" . $2  . "." . $3 . "\n";

# all that ends with ;
} elsif (m/^http:\/\/(.*?)\/(.*?)\;(.*)/) {
print $x . "http://" . $1 . "/" . $2  . "\n";

} else {
print $x . $_ . "sucks\n";
}
}

Save & Exit.

Now create cache dir and assign proper permission to proxy user

mkdir /cache1
chown proxy:proxy /cache1
chmod -R  777 /cache1

Now  initialize squid cache directories by

squid -z

You should see Following message

Creating Swap Directories

After this, start SQUID service by

service squid start

From Client end, point your browser to use your squid as proxy server and test any video.

TESTING YOUTUBE CACHING :-)

From client side, View any youtube video, after viewing, clear your cache and play it again (or from another pc) and you will notice that the video loads instantly,

Example:

You can monitor the CACHE TCP_HIT ENTRIES in squid logs, you can view them by

tail -f /var/log/squid/access.log | grep HIT

More information can be found at

http://aacable.wordpress.com/tag/aacable-howto-cache-youtube/

If you face “An error occured” in cached videos , see the following

http://aacable.wordpress.com/2012/01/30/youtube-caching-problem-an-error-occured-please-try-again-later-solved/

Regard’s

Syed Jahanzaib

Theme: Silver is the New Black. Blog at WordPress.com.

Follow

Get every new post delivered to your Inbox.

Join 664 other followers