Google admins have pudgy fingers - postfix FCrDNS and valid helo names

May 8, 2012 21:51 by author Victor Ratajczyk

Someone at google recently messed up the configuration of at least 40 of the outbound gmail and googlegroups mail servers. Google usually has perfectly configured mail servers. The google mail servers always use Forward Confirmed reverse DNS (FCrDNS). They also always use valid and publicly resolvable helo/ehlo host names, and those helo/ehlo host names always match the FCrDNS host name.

But from (at least) May 4th 2012 to May 8th 2012, the helo/ehlo host name was misconfigured on several dozen google mail servers. On these servers, the helo/ehlo host name did not match the FCrDNS host name, and the helo/ehlo host name provided did not resolve on the internet. This configuration error caused several ISPs to reject many emails from gmail users over the weekend of May 4th, 5th and 6th.

IP addresses affected appear to be 209.85.160.169 - 209.85.160.200 and 209.85.220.169 - 209.85.220.200

The fix:
Temporarily configure postfix so that it does not apply "reject_unknown_helo_hostname" when the sender domain is gmail.com or google.com.

-----------

Postfix refresher - basic anti-spam features.
Postfix has many anti-spam features and rules. Some of the basic anti-spam rules are reject_invalid_helo_hostname, reject_unknown_helo_hostname, reject_non_fqdn_helo_hostname, reject_unknown_reverse_client_hostname and reject_unknown_client_hostname. Any or all of these rules could be applied to all incoming mail, but it is more common that some rules are applied selectively.

  • reject_unknown_reverse_client_hostname - rejects the email if the IP of the sending server does not have a reverse DNS (ptr) record.
  • reject_invalid_helo_hostname - rejects the email if the sending server uses invalid characters in the helo/ehlo host name.
  • reject_non_fqdn_helo_hostname - rejects the email if the sending server does not use a fully qualified domain name as the helo/ehlo host name. Bad: "mxsrv-13". Good: "mxsrv-13.foo.com"
  • reject_unknown_helo_hostname - rejects the email if the sending server uses a helo/ehlo host name that does not resolve to a public IP address. Bad: "mx1.foo.local". Good: "mx1.foo.com"
  • reject_unknown_client_hostname - rejects the email if the IP address and host name of the sending server do not have Forward Confirmed reverse DNS (FCrDNS), sometimes called full-circle DNS.

So what is FCrDNS?
You have forward confirmed reverse DNS when the reverse (PTR) lookup of an IP address returns a host name, and the forward lookup of that host name returns the same IP address. You can check both directions with a "dig" or "nslookup" command.
For example:
nslookup 209.85.220.210 returns mail-vc0-f210.google.com
nslookup mail-vc0-f210.google.com returns 209.85.220.210
So, we can be confident that 209.85.220.210 really is controlled by google.
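The same two-step check can be sketched in Python. This is only a minimal sketch, not how postfix implements it: the function names (reverse_lookup, forward_lookup, has_fcrdns) are made up for this example, and the check compares address strings literally, which is fine for IPv4 but may need normalizing for IPv6.

```python
import socket


def reverse_lookup(ip):
    """Return the PTR host name for an IP address, or None if there is none."""
    try:
        return socket.gethostbyaddr(ip)[0]
    except OSError:
        return None


def forward_lookup(host):
    """Return the set of IP addresses a host name resolves to (may be empty)."""
    try:
        infos = socket.getaddrinfo(host, None)
    except OSError:
        return set()
    return {info[4][0] for info in infos}


def has_fcrdns(ip, reverse=reverse_lookup, forward=forward_lookup):
    """True when the PTR name of ip resolves back to the same ip."""
    host = reverse(ip)
    if host is None:
        return False
    return ip in forward(host)
```

Calling has_fcrdns("209.85.220.210") performs live DNS lookups, so the answer depends on what DNS says on the day you run it. The reverse and forward parameters exist only so the logic can be exercised with canned data.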

-----------

Back to our story...
The largest email providers generally have excellent configurations on their mail servers. Providers such as aol, amazon, apple, comcast, cox, gmail, hotmail, road runner, etc always use FCrDNS and they use valid helo/ehlo host names that always match the FCrDNS host name. But facebook is slightly different. While facebook always uses FCrDNS for the sending IP, they use a single (but still valid) helo/ehlo host name of "mx-out.facebook.com".

These providers use great configurations that help us to reject spoofed spam email. We can use a check_sender_access table in postfix to look up the sender domain. If the sender domain is one of the domains listed above, we can conditionally call reject_unknown_helo_hostname and reject_unknown_client_hostname.

The regex access map would look something like this:

/aol\.com$/ reject_unknown_helo_hostname,reject_unknown_client_hostname
/amazon\.com$/ reject_unknown_helo_hostname,reject_unknown_client_hostname
/(apple|mac)\.com$/ reject_unknown_helo_hostname,reject_unknown_client_hostname
/comcast\.net$/ reject_unknown_helo_hostname,reject_unknown_client_hostname
/cox\.com$/ reject_unknown_helo_hostname,reject_unknown_client_hostname
/facebook\.com$/ reject_unknown_helo_hostname,reject_unknown_client_hostname
/(gmail|google)\.com$/ reject_unknown_helo_hostname,reject_unknown_client_hostname
/(bing|hotmail|live|msn)\.com$/ reject_unknown_helo_hostname,reject_unknown_client_hostname
/rr\.com$/ reject_unknown_helo_hostname,reject_unknown_client_hostname
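To get a feel for which senders a map like this catches, here is a rough Python model of the lookup. It is a sketch under assumptions, not postfix itself: it assumes the first matching pattern wins and that matching is case-insensitive, and the names ACCESS_MAP and lookup are made up for the example. Only three of the entries are copied over, to keep it short.

```python
import re

BOTH = "reject_unknown_helo_hostname,reject_unknown_client_hostname"

# Three entries copied from the map above.
ACCESS_MAP = [
    (r"aol\.com$", BOTH),
    (r"facebook\.com$", BOTH),
    (r"(gmail|google)\.com$", BOTH),
]


def lookup(sender, access_map=ACCESS_MAP):
    """Return the list of restrictions for a sender address, or None."""
    for pattern, action in access_map:
        if re.search(pattern, sender, re.IGNORECASE):
            return action.split(",")
    return None


for sender in ("someone@gmail.com", "someone@example.org", "someone@notgmail.com"):
    print(sender, lookup(sender))
```

Note that the last address also matches, because the patterns are only anchored at the end. For this map that is harmless (an unrelated domain just gets the stricter checks too), but it is worth knowing if you reuse the patterns for anything that grants trust.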


But with the current situation at google,
/(gmail|google)\.com$/       reject_unknown_helo_hostname,reject_unknown_client_hostname

Needs to be changed to just:
/(gmail|google)\.com$/       reject_unknown_client_hostname

 

Here is a list of the affected IP addresses. As you can see, the helo host names are one character off from the FCrDNS host names, and are just a fat fingered typo.

Valid FCrDNS                             Invalid Helo (does not resolve)
mail-gh0-f169.google.com[209.85.160.169] mail-gy0-f169.google.com
mail-gh0-f170.google.com[209.85.160.170] mail-gy0-f170.google.com
mail-gh0-f171.google.com[209.85.160.171] mail-gy0-f171.google.com
mail-gh0-f172.google.com[209.85.160.172] mail-gy0-f172.google.com
mail-gh0-f173.google.com[209.85.160.173] mail-gy0-f173.google.com
mail-gh0-f174.google.com[209.85.160.174] mail-gy0-f174.google.com
mail-gh0-f175.google.com[209.85.160.175] mail-gy0-f175.google.com
mail-gh0-f176.google.com[209.85.160.176] mail-gy0-f176.google.com
mail-gh0-f177.google.com[209.85.160.177] mail-gy0-f177.google.com
mail-gh0-f178.google.com[209.85.160.178] mail-gy0-f178.google.com
mail-gh0-f179.google.com[209.85.160.179] mail-gy0-f179.google.com
mail-gh0-f180.google.com[209.85.160.180] mail-gy0-f180.google.com
mail-gh0-f181.google.com[209.85.160.181] mail-gy0-f181.google.com
mail-gh0-f182.google.com[209.85.160.182] mail-gy0-f182.google.com
mail-gh0-f183.google.com[209.85.160.183] mail-gy0-f183.google.com
mail-gh0-f185.google.com[209.85.160.185] mail-gy0-f185.google.com
mail-gh0-f188.google.com[209.85.160.188] mail-gy0-f188.google.com
mail-gh0-f192.google.com[209.85.160.192] mail-gy0-f192.google.com
mail-gh0-f197.google.com[209.85.160.197] mail-gy0-f197.google.com
mail-gh0-f198.google.com[209.85.160.198] mail-gy0-f198.google.com
mail-gh0-f199.google.com[209.85.160.199] mail-gy0-f199.google.com
mail-gh0-f200.google.com[209.85.160.200] mail-gy0-f200.google.com
mail-vc0-f169.google.com[209.85.220.169] mail-vx0-f169.google.com
mail-vc0-f170.google.com[209.85.220.170] mail-vx0-f170.google.com
mail-vc0-f171.google.com[209.85.220.171] mail-vx0-f171.google.com
mail-vc0-f172.google.com[209.85.220.172] mail-vx0-f172.google.com
mail-vc0-f174.google.com[209.85.220.174] mail-vx0-f174.google.com
mail-vc0-f176.google.com[209.85.220.176] mail-vx0-f176.google.com
mail-vc0-f177.google.com[209.85.220.177] mail-vx0-f177.google.com
mail-vc0-f178.google.com[209.85.220.178] mail-vx0-f178.google.com
mail-vc0-f179.google.com[209.85.220.179] mail-vx0-f179.google.com
mail-vc0-f180.google.com[209.85.220.180] mail-vx0-f180.google.com
mail-vc0-f181.google.com[209.85.220.181] mail-vx0-f181.google.com
mail-vc0-f182.google.com[209.85.220.182] mail-vx0-f182.google.com
mail-vc0-f183.google.com[209.85.220.183] mail-vx0-f183.google.com
mail-vc0-f185.google.com[209.85.220.185] mail-vx0-f185.google.com
mail-vc0-f188.google.com[209.85.220.188] mail-vx0-f188.google.com
mail-vc0-f192.google.com[209.85.220.192] mail-vx0-f192.google.com
mail-vc0-f195.google.com[209.85.220.195] mail-vx0-f195.google.com
mail-vc0-f196.google.com[209.85.220.196] mail-vx0-f196.google.com
mail-vc0-f197.google.com[209.85.220.197] mail-vx0-f197.google.com
mail-vc0-f198.google.com[209.85.220.198] mail-vx0-f198.google.com
mail-vc0-f200.google.com[209.85.220.200] mail-vx0-f200.google.com
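The "one character off" claim is easy to check mechanically. A small sketch, using the first pair from each block of the list above (the helper name differing_positions is made up for this example):

```python
def differing_positions(a, b):
    """Return the indexes at which two equal-length names differ."""
    if len(a) != len(b):
        raise ValueError("names must have the same length")
    return [i for i, (x, y) in enumerate(zip(a, b)) if x != y]


pairs = [
    ("mail-gh0-f169.google.com", "mail-gy0-f169.google.com"),
    ("mail-vc0-f169.google.com", "mail-vx0-f169.google.com"),
]

for ptr_name, helo_name in pairs:
    for i in differing_positions(ptr_name, helo_name):
        print(f"{ptr_name} vs {helo_name}: index {i}, "
              f"{ptr_name[i]!r} became {helo_name[i]!r}")
```

In both blocks the only difference is the seventh character of the host name: "h" became "y" in one block and "c" became "x" in the other.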

 



Irrational exuberance, Apple vs the top 10 retailers

March 28, 2012 21:45 by author Victor Ratajczyk

What company would you rather own, Apple or Wal-Mart? By "own", I don't mean own a few shares. I mean, if you had $575 billion lying around and you could own the entire company, would you rather own Apple or Wal-Mart?

So you say that you would take Apple over Wal-Mart?
What if I throw in Home Depot too? How about I also throw in CVS, Costco, Target, Lowe's, Walgreens, Kroger (King Soopers), Best Buy and Sears (Kmart)?

On March 28th 2012 Apple shares closed at $617.62 per share, giving a market cap of $575.85 Billion. This is more than the combined market cap of the top 10 US retailers ($521.31B). 

 

Total sales of the top 10 retailers dwarf Apple's sales. Apple had $108 billion in sales. The top 10 retailers had $1.041 trillion ($1,041 billion) in sales.

 

Here are the yearly gross profits. Apple: $33.79B. Top10: $57.39B. 

 

Here are the yearly net profits. Apple: $25.92B. Top10: $34.06B. 

 

If you did have $575 billion burning a hole in your pocket, you could buy Apple, or you could buy Wal-Mart, Home Depot, CVS, Costco, Target, Lowe's, Walgreens, Kroger (King Soopers), Best Buy and Sears (Kmart)... Then still have $54.5 billion left over (for a rainy day).
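For anyone who wants to check the arithmetic, here it is as a few lines of Python. The figures are the ones quoted in this post (in billions of dollars); the variable names are just for this example.

```python
# All figures in billions of US dollars, as quoted above.
apple_market_cap = 575.85
top10_market_cap = 521.31
apple_sales = 108.0
top10_sales = 1041.0

# What is left after buying all ten retailers instead of Apple.
left_over = round(apple_market_cap - top10_market_cap, 2)

# How many times larger the retailers' combined sales are.
sales_ratio = top10_sales / apple_sales

print(f"left over: ${left_over}B")
print(f"top 10 sales are {sales_ratio:.1f}x Apple's sales")
```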

 

Update, March 29th 2012
With your extra $54.5 billion, you could also pick up Macy's ($16.58B), AutoZone ($14.67B), Kohls ($11.91B), JC Penney ($7.81B) and Wendy's/Arby's ($1.93B). Even after these purchases (which total $52.9 billion), you would still have about $1.6 billion left over.

 



mysql fails to connect to localhost - disable ipv6 on windows

February 25, 2012 10:42 by author Victor Ratajczyk

MySQL client fails to connect to localhost on Windows, due to IPv6
A MySQL client running on Windows 2008 or Windows 7 may fail to connect to the host name of "localhost". This is due to Windows resolving the host name of localhost to the IPv6 loopback address of ::1. Windows will resolve localhost to ::1, even if IPv6 is disabled on all local network adapters.

MySQL IPv6 status
MySQL Server version 5.5.3 (March 2010) and later support IPv6 connections to localhost, using the ::1 IPv6 address. So recent MySQL installations can at least listen for IPv6 client connections. But previous versions of MySQL server interpret ::1 as a string or host name, rather than an IP address.

On the client software side, things are not as simple.
We have the official MySQL Connector client libraries of connector/j, connector/odbc, connector/net, connector/c, connector/c++, connector/mxj, and the MySQL-PHP native driver (mysqlnd). On PHP we have ext/mysql, mysqli and pdo_mysql. ColdFusion uses its own MySQL client, with a different version of that client in each version of ColdFusion (mx, 7, 8, 9). Then there are many third party MySQL client libraries for Perl, ASP.Net, etc, etc, etc.

You just can't trust the client to support IPv6. Only a few MySQL client libraries currently support IPv6 connections, and of those that do support IPv6, only the most recent versions may properly support IPv6.

The localhost problem
So here is where we run into a problem...

By default, Windows 7 and Windows 2008 R2 resolve the localhost host name to the ::1 IPv6 loopback address, rather than the 127.0.0.1 IPv4 loopback address. Windows will resolve "localhost" to ::1, even if you have disabled IPv6 on all of the installed network adapters.

This doesn't just affect MySQL. This affects any TCP/IP client/server program that may use "localhost" as a connection parameter.

As you can see in the image below, a "Ping localhost" command, on a default w2k8 r2 install, returns the IPv6 ::1 loopback address.

 

The images below show that ping localhost returns the ::1 IPv6 address, even when IPv6 is disabled on all local network adapters.
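If you would rather check from a script than from ping, the sketch below asks the operating system resolver for "localhost" and reports which address family comes first. It is only a sketch: the function names are made up for this example, and the order returned by getaddrinfo is what many client libraries see, which is not guaranteed to match what ping prints.

```python
import ipaddress
import socket


def resolved_addresses(host="localhost", resolver=socket.getaddrinfo):
    """Return the addresses for host, in the order the resolver gives them."""
    seen = []
    for info in resolver(host, None):
        address = info[4][0]
        if address not in seen:
            seen.append(address)
    return seen


def first_family(addresses):
    """Return 4 or 6 for the first address in the list, or None if empty."""
    if not addresses:
        return None
    # Drop any "%scope" suffix before parsing the address.
    return ipaddress.ip_address(addresses[0].split("%")[0]).version


if __name__ == "__main__":
    found = resolved_addresses()
    print("localhost resolves to:", found)
    print("first address family: IPv%s" % first_family(found))
```

If the first family reported is 6, a client that only speaks IPv4 (or a server that only listens on 127.0.0.1) is likely to run into the problem described here.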


Fixing localhost IPv4 resolution

So we've decided that we don't like this sneaky IPv6 result for localhost resolution. The software that we are running prefers that localhost resolve to trusty old 127.0.0.1.
What to do?

We can correct this behavior in the Windows Hosts file, or by using the Windows registry to modify how IPv6 works or disable IPv6 altogether. My personal preference is to use the Windows hosts file to specifically map localhost to 127.0.0.1 and to modify the IPv4/IPv6 resolution preference.

Via Windows Hosts file
First, let's take a look at the default Windows hosts file, at C:\Windows\System32\drivers\etc\hosts (open with notepad)
In previous versions of Windows, we find the following active entry for "localhost"
127.0.0.1       localhost

But in Windows 7 and Windows 2008 R2, the hosts file is effectively empty. While there are both IPv4 and IPv6 entries for "localhost", both are disabled by being commented out.
# 127.0.0.1       localhost
# ::1             localhost

The image below shows the default hosts file, on a clean install of w2k8 r2.



If both "localhost" entries are disabled (commented out), localhost will resolve to the ::1 IPv6 address.
If both "localhost" entries are enabled (active), localhost will still resolve to the ::1 IPv6 address.



If we only enable the 127.0.0.1 entry for localhost, then we get the proper IPv4 resolution for a ping of localhost.
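To see at a glance which localhost entries are actually active in a hosts file, a few lines of Python will do. This is a minimal sketch: the function name active_entries is made up, and it only understands the basic "address name [name...]" layout with # comments.

```python
def active_entries(hosts_text, name="localhost"):
    """Return the addresses mapped to name by non-comment lines."""
    addresses = []
    for line in hosts_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        fields = line.split()
        address, names = fields[0], fields[1:]
        if name.lower() in (n.lower() for n in names):
            addresses.append(address)
    return addresses


default_hosts = """\
# 127.0.0.1       localhost
# ::1             localhost
"""

fixed_hosts = """\
127.0.0.1       localhost
# ::1             localhost
"""

print(active_entries(default_hosts))  # → []
print(active_entries(fixed_hosts))    # → ['127.0.0.1']
```

To run it against the real file, read C:\Windows\System32\drivers\etc\hosts and pass its text to active_entries().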



Via Windows Registry
We can modify the behavior of Windows IPv6 from the "HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\services\TCPIP6\Parameters\" registry key. The entry we are interested in is named "DisabledComponents". But the DisabledComponents entry probably doesn't exist on your system. So we need to create a new dword entry named "DisabledComponents".

You can now disable IPv6 support by setting the value of DisabledComponents to a hex value of ffffffff (that's 8 "f's"). After you reboot, a ping of localhost will return 127.0.0.1, even if you don't touch the hosts file.



But instead of completely disabling IPv6 support in Windows, you can just tell Windows to prefer IPv4 over IPv6. You can do this by setting the value of DisabledComponents to a hex value of 20. After you reboot, a ping of localhost will return 127.0.0.1, even if you don't touch the hosts file.




My personal preference is a belt and suspenders approach to make sure that localhost resolves to 127.0.0.1, while trying not to break other (or future) Windows features by ripping out IPv6. I set a 127.0.0.1 IPv4 address in the hosts file and I set DisabledComponents to a hex value of 20, to adjust the IPv4/IPv6 preference.
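If you have more than one machine to change, the registry edit can be scripted with the reg.exe tool that ships with Windows. The sketch below only builds the command line; it does not run it. Assumptions: "HKLM" is used as reg.exe's short form of HKEY_LOCAL_MACHINE, the value is passed in decimal, and the function name reg_add_command is made up for this example.

```python
KEY = r"HKLM\SYSTEM\CurrentControlSet\services\TCPIP6\Parameters"

PREFER_IPV4 = 0x20           # prefer IPv4 over IPv6 (value from this article)
DISABLE_IPV6 = 0xFFFFFFFF    # disable IPv6 support (value from this article)


def reg_add_command(value):
    """Build the reg.exe argument list that sets DisabledComponents."""
    if not 0 <= value <= 0xFFFFFFFF:
        raise ValueError("DisabledComponents must fit in a 32-bit DWORD")
    return ["reg", "add", KEY,
            "/v", "DisabledComponents",
            "/t", "REG_DWORD",
            "/d", str(value),
            "/f"]


print(" ".join(reg_add_command(PREFER_IPV4)))
```

On the Windows machine itself you could pass the list to subprocess.run() from an elevated prompt. As described above, a reboot is still needed before the change takes effect.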






Windows IIS PHP file upload first character truncate

February 22, 2012 21:27 by author Victor Ratajczyk

When uploading a file, PHP 5.3.7 and 5.3.8 may truncate the first character of the file name
I recently helped a friend track down a PHP problem on their sites. This person rents a dedicated server and hosts several dozen wordpress, joomla and opencart sites.

For the most part, everything worked fine on his sites. But he did have one odd problem. Any time he uploaded a file via a PHP form, the first character of the file name was cut off. If he were to upload "filefoo1.png", the system would truncate the first character and save the file as "ilefoo1.png". To deal with the problem, he was (successfully) simply prepending a letter to the name of each file that he uploaded.

To save "foofile2.jpg", he would upload it as "Qfoofile2.jpg".

As it turns out, this is due to a bug in specific versions of PHP. The affected versions are

  • PHP 5.3.7 (August 18th 2011)
  • PHP 5.3.8 (August 23rd 2011)

 

To fix this problem, upgrade to PHP 5.3.9 (January 10th 2012) or later.

 

 



Windows PHP IIS chmod and is_writable

February 16, 2012 20:32 by author Victor Ratajczyk

Examining compatibility and behavior of the PHP chmod() and is_writable() functions on Windows
Compatibility of the PHP "is_writable()" function on the Windows platform is greatly improved, when compared to earlier versions of PHP. This article demonstrates that the PHP "is_writable()" function now works as expected on Windows. I will also demonstrate using the PHP chmod() function on Windows, to set and remove the read-only bit for a file.

Nearly every PHP application (except for Gallery) runs just fine on Windows with IIS and PHP. This has been true for quite a while now. PHP ran just fine on IIS, even before Microsoft released their fastcgi handler. Prior to the release of the fastcgi handler, I regularly configured PHP4 and PHP5 sites to run on IIS6. There were almost never any problems running the PHP applications on Windows and IIS. But where I would regularly see problems, was on the installation of PHP applications on IIS.

The is_writable() function on Windows
Until fairly recently, the PHP is_writable() function did not work as expected on Windows platforms. This situation was a constant headache for website administrators, PHP developers, and system admins alike. On Windows platforms, is_writable() would always return true, unless the file had the read-only attribute set. File ACLs were not checked by PHP. The file ACLs could be set so that the web user could not write to the file, but is_writable() would still return true.

This problem could rear its head when an application such as Joomla or any other advanced PHP application is installed. Some applications may try to make sure that the site admin has properly secured the PHP app. The app might check to make sure that the "setup" directories have been deleted from the publicly accessible portion of the site. The app might also run is_writable() against the config file for the PHP app. The problem is, is_writable() would always return "true" on Windows.

The problem would not be corrected, even if the website admin used a control panel to change the file ACL, or asked the system administrator to fix the file ACL. The only solution was to set the file to read-only. Unfortunately, PHP application readme files rarely mentioned this behavior, and few Windows system admins knew about it.

This led many PHP users, developers, and even Windows system admins to think that many PHP apps were unreliable on Windows.

PHP is_writable() finally works
Tested on IIS 7.5 and the latest version of PHP 5.
Tested on this site.
This site is configured with an anon user name of "victorrweba".

Test 1, directory.
Check if the directory "D:/wwwsites/victor-ratajczykcom/www/examples/php/dir1/" can be written to.
Directory ACL for "victorrweba": Read, Write, Modify.
Command: is_writable('D:/wwwsites/victor-ratajczykcom/www/examples/php/dir1/');
Result: True.

Test 2, directory.
Check if the directory "D:/wwwsites/victor-ratajczykcom/www/examples/php/dir2/" can be written to.
Directory ACL for "victorrweba": Read. (Write, Modify have been removed)
Command: is_writable('D:/wwwsites/victor-ratajczykcom/www/examples/php/dir2/');
Result: False.

Test 3, files.
Check if the file "D:/wwwsites/victor-ratajczykcom/www/examples/php/files1/file1.txt" can be written to.
File ACL for "victorrweba": Read, Write, Modify.
Command: is_writable('D:/wwwsites/victor-ratajczykcom/www/examples/php/files1/file1.txt');
Result: True.

Test 4, files.
Check if the file "D:/wwwsites/victor-ratajczykcom/www/examples/php/files1/file2.txt" can be written to.
File ACL for "victorrweba": Read, Write. (Modify has been removed)
Command: is_writable('D:/wwwsites/victor-ratajczykcom/www/examples/php/files1/file2.txt');
Result: True.

Test 5, files.
Check if the file "D:/wwwsites/victor-ratajczykcom/www/examples/php/files1/file3.txt" can be written to.
File ACL for "victorrweba": Read. (Write and Modify have been removed)
Command: is_writable('D:/wwwsites/victor-ratajczykcom/www/examples/php/files1/file3.txt');
Result: False.

Test 6, files, read-only files.
Check if the file "D:/wwwsites/victor-ratajczykcom/www/examples/php/files1/file4.txt" can be written to.
In this test we leave the file ACLs at read, write, modify. But we set the read-only bit on the file. Here we see that even if the anon user has ACL permissions, the read-only bit locks the file down.

File ACL for "victorrweba": Read, Write, Modify.
Read-only bit: On
Command: is_writable('D:/wwwsites/victor-ratajczykcom/www/examples/php/files1/file4.txt');
Result: False.


So now you may be thinking to yourself: "Good for you Victor, PHP is_writable() works for you. But my host is on an older version of PHP. The only option I have is to set the read-only bit on the file, and the only way to set the read-only bit on a file is to ask the host to do it for me, and it's a big hassle to do this every time I need to make a change to the config.php file."

 

The PHP chmod() function to the rescue
It is widely reported that the PHP chmod() command does not work on Windows. This is (partially) false.

The PHP chmod() function will not change the file ACLs on Windows.
The PHP chmod() function can not grant or revoke Read, Write, Execute privileges to the anon user on Windows.
The PHP chmod() function on Windows does not work anything like the PHP chmod() function on Linux.

But the PHP chmod() function can do one important thing on Windows. The PHP chmod() function can turn the read-only bit on and off, for Windows files.
The web anon user does not need "modify" permissions to set the read-only bit. The web anon user only needs "write" permissions to set or remove the read-only bit.

Set the read-only bit on a Windows file, via the PHP chmod() function
<?php
      chmod( "D:/wwwsites/examplecom/www/config/config.php", 0444 );
?>

Remove the read-only bit on a Windows file, via the PHP chmod() function
<?php
      chmod( "D:/wwwsites/examplecom/www/config/config.php", 0777 );
?>
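This behavior is not unique to PHP. As a point of comparison (this is Python, not PHP, and not part of the setup described above), Python's os.chmod() is documented to do the same thing on Windows: it can only set or clear the read-only flag, via stat.S_IREAD and stat.S_IWRITE. A minimal sketch, with made-up helper names:

```python
import os
import stat


def set_read_only(path, read_only=True):
    """Turn the read-only flag on or off for a file."""
    if read_only:
        os.chmod(path, stat.S_IREAD)
    else:
        os.chmod(path, stat.S_IREAD | stat.S_IWRITE)


def is_read_only(path):
    """True when the owner write bit (the read-only flag on Windows) is off."""
    return not (os.stat(path).st_mode & stat.S_IWRITE)
```

As with the PHP version, remember to clear the flag again before you try to overwrite or delete the file.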

 

Notes
Once you set the read-only bit on a file, you must try to remember that you have done so.
When the read-only bit is set, you will not be able to write to, delete or overwrite the file via FTP or the web app, until you remove the read-only bit.

 

 

 





block browsers bots and scrapers - with web config requestfiltering and user agent

February 13, 2012 12:07 by author Victor Ratajczyk

User-agent filtering with web.config and requestFiltering
In this article I will show how to use web.config, requestFiltering and filteringRules to block browsers and robots, based on the user-agent string. Examples include blocking browsers such as Internet Explorer, Chrome, FireFox and Opera, and blocking search engine robots such as yandexbot, baiduspider, googlebot, yahoo, bing, etc.

Each time someone visits your site, their browser software identifies itself by sending a user-agent string. The user-agent string identifies the browser software and the browser version. The user-agent string sometimes also includes information on the operating system type, name, and version, as well as information about installed plug-ins. To help identify their crawlers, search engines (google, bing, yahoo, yandex, baidu, etc) also send user-agent strings when their software crawls your web site.

The requestFiltering section of a web.config file can be used to block specific browser, bot, and spider user-agents from visiting your web site. The web.config can be used to block a user-agent for an entire site, or on a directory by directory basis. User-agent blocking may be applied to all content, or to specific file types (.gif, .jpg, .php). When a blocked user-agent tries to access protected content, that user-agent will receive a 404 (file not found) error.

Why it's done
There are many reasons why you may want to block certain user-agent strings. Some search engine spiders may be ignoring your robots.txt directives. Maybe an automated bot is regularly downloading all of your images. Maybe on inspection of your log files, you find that bot infected machines are regularly attacking your site, and they are using a specific user-agent. Maybe you are some sort of zealot who wants to block certain browsers or operating systems from your site.

How it's done

  • Use a text editor to create a file named web.config
  • Save the web.config file with the appropriate content
  • Place the web.config file in the directory that you wish to protect


Examples

  • Block the yandex search engine from php files on your site
    <?xml version="1.0"?>
    <configuration>
       <system.webServer>
          <security>
            <requestFiltering>
              <filteringRules>
                <!-- name the rule -->
                <filteringRule name="user agent deny" scanUrl="false" scanQueryString="false">
                  <scanHeaders>
                    <!-- apply rule to user-agent header -->
                    <add requestHeader="user-agent" />
                  </scanHeaders>
                  <appliesTo>
                    <clear />
                    <!-- only apply rule to php files -->
                    <add fileExtension=".php" />
                  </appliesTo>
                  <denyStrings>
                    <clear />
                    <!-- block the yandex bot -->
                    <add string="yandex" />
                  </denyStrings>
                </filteringRule>
              </filteringRules>
            </requestFiltering>
         </security>
       </system.webServer>
    </configuration> 
    
  • Block the yandex search engine from all files on your site
    <?xml version="1.0"?>
    <configuration>
       <system.webServer>
          <security>
            <requestFiltering>
              <filteringRules>
                <!-- name the rule -->
                <filteringRule name="user agent deny" scanUrl="false" scanQueryString="false">
                  <scanHeaders>
                    <!-- apply rule to user-agent header -->
                    <add requestHeader="user-agent" />
                  </scanHeaders>
                  <!-- apply rule to all files -->
                  <appliesTo />
                  <denyStrings>
                    <clear />
                    <!-- block the yandex bot -->
                    <add string="yandex" />
                  </denyStrings>
                </filteringRule>
              </filteringRules>
            </requestFiltering>
         </security>
       </system.webServer>
    </configuration>
    
  • Block a java based bot from accessing images
    <?xml version="1.0"?>
    <configuration>
       <system.webServer>
          <security>
            <requestFiltering>
              <filteringRules>
                <!-- name the rule -->
                <filteringRule name="user agent deny" scanUrl="false" scanQueryString="false">
                  <scanHeaders>
                    <!-- apply rule to user-agent header -->
                    <add requestHeader="user-agent" />
                  </scanHeaders>
                  <appliesTo>
                    <clear />
                    <!-- only apply rule to image files -->
                    <add fileExtension=".gif" />
                    <add fileExtension=".jpg" />
                    <add fileExtension=".png" />
                  </appliesTo>
                  <denyStrings>
                    <clear />
                    <!-- block the fake java img bot -->
                    <add string="java/1.4" />
                  </denyStrings>
                </filteringRule>
              </filteringRules>
            </requestFiltering>
         </security>
       </system.webServer>
    </configuration>
    
  • Block multiple search engine spiders, from all files, with a single rule
    <?xml version="1.0"?>
    <configuration>
       <system.webServer>
          <security>
            <requestFiltering>
              <filteringRules>
                <!-- name the rule -->
                <filteringRule name="user agent deny" scanUrl="false" scanQueryString="false">
                  <scanHeaders>
                    <!-- apply rule to user-agent header -->
                    <add requestHeader="user-agent" />
                  </scanHeaders>
                  <!-- apply rule to all files -->
                  <appliesTo />
                  <denyStrings>
                    <clear />
                    <!-- block the following bots -->
                    <add string="yandex" />
                    <add string="baiduspider" />
                    <add string="sogou" />
                  </denyStrings>
                </filteringRule>
              </filteringRules>
            </requestFiltering>
         </security>
       </system.webServer>
    </configuration>
    

 

Notes

  • Requires: Windows 2008 Server, IIS 7 (w2k8) or IIS 7.5 (w2k8 r2)
  • Requires: IIS sub feature: Request filtering (installed by default with IIS)
  • Requires: Request filtering delegation set to: Read/Write
  • The requestHeader name is not case sensitive <add requestHeader="not-case-sensitive" />
  • The denyStrings string text is not case sensitive <add string="not-case-sensitive" />
  • If the bot or browser doesn't send a user-agent string, none of this will help you.
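To make the matching rules concrete, here is a rough Python model of what a single filteringRule does with a request. It is a simplified sketch of the behavior described in this article, not the IIS implementation: the function name is_denied and its parameters are made up, and passing extensions=None stands in for an empty <appliesTo /> (all files).

```python
import os


def is_denied(path, headers, deny_strings, extensions=None,
              header_name="user-agent"):
    """Return True when the modeled rule would block this request."""
    # appliesTo: skip the rule when the file extension is not listed.
    if extensions is not None:
        ext = os.path.splitext(path)[1].lower()
        if ext not in {e.lower() for e in extensions}:
            return False

    # scanHeaders: the header name is not case sensitive.
    value = ""
    for name, header_value in headers.items():
        if name.lower() == header_name.lower():
            value = header_value.lower()
            break

    # denyStrings: a case-insensitive substring match on the header value.
    return any(s.lower() in value for s in deny_strings)


yandex = {"User-Agent": "Mozilla/5.0 (compatible; YandexBot/3.0)"}
print(is_denied("/index.php", yandex, ["yandex"], [".php"]))  # → True
print(is_denied("/logo.png", yandex, ["yandex"], [".php"]))   # → False
print(is_denied("/index.php", {}, ["yandex"], [".php"]))      # → False
```

The last call shows the point made in the notes: a client that sends no user-agent header at all sails straight through.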

 

 

 

 



web.config defaultDocument - use web.config to set a default page for a web site

February 11, 2012 23:11 by author Victor Ratajczyk

web.config defaultDocument
The defaultDocument section of the web.config file can be used to set a custom default page (document) for your web site. You can also have a custom list of default documents for your website. The web server checks the current directory for each file in the list, in order, and serves the first one that exists.

The web.config can be used to change the default document (page) for an entire site, or on a directory by directory basis. The default page may be a .aspx, .asp, .htm, .html, .txt, or any other file type handled by the web server.

Why it's done
People typically type "foo.com" into their browsers, rather than "foo.com/index.aspx". When someone visits your website without specifying a page, the web server returns the default document. This also applies if someone visits a subdirectory on your site, such as foo.com/dir1/ or foo.com/dir2, but doesn't specify a page.

If there isn't a default document in the directory, the client will receive a "file not found" or "directory browsing denied" error. Web servers are typically configured to search for a list of default files. Depending on your configuration, the default document list in IIS 7.5 may include the files listed below.

  • default.aspx
  • default.asp
  • default.htm
  • index.asp
  • index.aspx
  • index.htm
  • index.html
  • index.php

Compatibility
The defaultDocument section of web.config is compatible with IIS 7 (w2k8) and IIS 7.5 (w2k8 r2).

Web.config files are deeply integrated with IIS 7.x. While some web.config sections sometimes require that the containing directory is set as an application, this isn't one of them. A simple web.config with a defaultDocument section may be placed in any directory, and the directory does NOT need to be set as an application.

Example
Example default document list. Comments are enclosed in <!-- --> and are not required.

<!-- this line enables default documents for a directory -->
<defaultDocument enabled="true">
   <files>
       <!-- clear, removes the existing default document list -->
       <clear/>
       <!-- set foo.htm as the default document  -->                 
       <add value="foo.htm"/>
       <!-- set foo.php as the 2nd default document  -->
       <add value="foo.php"/>
       <!-- set foo.aspx as the 3rd default document  -->             
       <add value="foo.aspx"/>
   </files>
</defaultDocument>
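The lookup that such a list implies can be sketched in a few lines of Python. This is only a model of the behavior, not IIS code: the function name pick_default_document is made up, and the case-insensitive comparison is an assumption based on Windows file names not being case sensitive.

```python
def pick_default_document(files_in_directory, default_documents):
    """Return the first default document present in the directory, or None."""
    present = {name.lower() for name in files_in_directory}
    for candidate in default_documents:
        if candidate.lower() in present:
            return candidate
    return None


defaults = ["foo.htm", "foo.php", "foo.aspx"]

# foo.htm is missing, so the second entry in the list wins.
print(pick_default_document(["foo.php", "foo.aspx", "logo.png"], defaults))  # → foo.php

# None of the listed files exist, so there is no default document to serve.
print(pick_default_document(["readme.txt"], defaults))  # → None
```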

 

 

Using defaultDocument to change the default page

  • Use a text editor to create a file named web.config
  • Save the web.config file with the appropriate content
  • Place the web.config file in the directory where you wish to change the default page


Detailed web.config examples

If there isn't an existing web.config in the directory, your new web.config should look something like this

<?xml version="1.0"?>
<configuration>
  <system.webServer>
    <!-- this line enables default documents for a directory -->
    <defaultDocument enabled="true">
      <files>
       <!-- clear, removes the existing default document list -->
       <clear/>
       <!-- set foo.htm as the default document  -->                 
       <add value="foo.htm"/>
       <!-- set foo.php as the 2nd default document  -->
       <add value="foo.php"/>
       <!-- set foo.aspx as the 3rd default document  -->             
       <add value="foo.aspx"/>
      </files>
    </defaultDocument>
   <modules runAllManagedModulesForAllRequests="true"/>
  </system.webServer>
</configuration>



If there is an existing web.config without a <system.webServer> section, your new web.config should look like this:

<?xml version="1.0"?>
<configuration>
   <system.web>
     <!-- .. existing text .. -->
     <!-- .. existing text .. -->
   </system.web>
   <system.webServer>
    <!-- this line enables default documents for a directory -->
    <defaultDocument enabled="true">
      <files>
       <!-- clear, removes the existing default document list -->
       <clear/>
       <!-- set foo.htm as the default document  -->                 
       <add value="foo.htm"/>
       <!-- set foo.php as the 2nd default document  -->
       <add value="foo.php"/>
       <!-- set foo.aspx as the 3rd default document  -->             
       <add value="foo.aspx"/>
      </files>
    </defaultDocument>
   <modules runAllManagedModulesForAllRequests="true"/>
  </system.webServer>
</configuration>



If your existing web.config already has a <system.webServer> section, just add the <defaultDocument> section:

<?xml version="1.0"?>
<configuration>
   <system.web>
     <!-- .. existing text .. -->
     <!-- .. existing text .. -->
   </system.web>
   <system.webServer>
    <!-- .. existing text .. -->
    <!-- this line enables default documents for a directory -->
    <defaultDocument enabled="true">
      <files>
       <!-- clear, removes the existing default document list -->
       <clear/>
       <!-- set foo.htm as the default document  -->                 
       <add value="foo.htm"/>
       <!-- set foo.php as the 2nd default document  -->
       <add value="foo.php"/>
       <!-- set foo.aspx as the 3rd default document  -->             
       <add value="foo.aspx"/>
      </files>
    </defaultDocument>
    <!-- .. existing text .. -->
  </system.webServer>
</configuration>
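
As an alternative to placing a separate web.config in each directory, the same defaultDocument section can be wrapped in a <location> element in a parent web.config. This is a minimal sketch rather than a tested configuration. The directory name "docs" is hypothetical, and foo.htm is reused from the examples above.

```xml
<?xml version="1.0"?>
<configuration>
  <!-- path is relative to the directory that holds this web.config -->
  <!-- "docs" is a hypothetical subdirectory used only for illustration -->
  <location path="docs">
    <system.webServer>
      <defaultDocument enabled="true">
        <files>
          <!-- clear, removes the existing default document list -->
          <clear/>
          <!-- set foo.htm as the default document for the docs directory -->
          <add value="foo.htm"/>
        </files>
      </defaultDocument>
    </system.webServer>
  </location>
</configuration>
```

This keeps the settings for several directories in one file, at the cost of the parent web.config needing to know the directory names.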

Web.config httpRedirect - Redirecting individual pages with 301, 302, and 307 status codes

clock February 7, 2012 23:23 by author Victor Ratajczyk |

This article, the third in a series, shows how to use web.config files to redirect individual pages to another page or site with a 301, 302, or 307 status code. Part one explains what HTTP redirect status codes are and provides several example web.config files. Part two shows examples of web.config files in action.

Purpose
HTTP response redirect status codes are used to redirect web requests for a web site, directory, or page to another location. The redirect could target another page or directory on the same domain, or a page or directory on another domain. Response redirect status codes have many uses, but they are most often used after redesigning a web site, changing domain names, or merging two or more web sites.

Compatibility
The httpRedirect section of web.config is compatible with IIS 7 (w2k8) and IIS 7.5 (w2k8 r2).

Web.config files are deeply integrated with IIS 7.x. The httpRedirect directives listed in this article will apply to all files and directories (php, jpg, png, htm, etc), not just asp.net files.

Some web.config sections require that the containing directory be set as an application, but httpRedirect isn't one of them. A simple web.config with an httpRedirect section may be placed in any directory, and the directory does NOT need to be set as an application.

Prerequisites

  • Windows 2008 Server, IIS 7 (w2k8) or IIS 7.5 (w2k8 r2)
  • IIS sub feature: HTTP Redirection (not installed by default with IIS)
  • HTTP Redirection delegation set to: Read/Write

I already know
Yes, I know that we can redirect individual pages by placing the following into the head of the page:
 <META HTTP-EQUIV=Refresh CONTENT="0; URL=http://www.foo.com/page.htm" />
But that's not what we are talking about. Here, we use web.config to redirect the page instead. One advantage of using web.config for the redirect is control over the status code. With web.config, we can set the status code to 301, 302, or 307.
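
To illustrate the status code control described above, here is a minimal sketch of a single-page redirect that returns a 302 instead of a 301. The file name oldpage.htm is hypothetical, and the destination reuses the www.foo.com URL from the META example. The httpResponseStatus attribute accepts Permanent (301), Found (302), or Temporary (307).

```xml
<?xml version="1.0"?>
<configuration>
    <!-- oldpage.htm is a hypothetical file in the same directory as this web.config -->
    <location path="oldpage.htm">
        <system.webServer>
            <!-- Found = 302; use Permanent for 301 or Temporary for 307 -->
            <httpRedirect enabled="true" destination="http://www.foo.com/page.htm" httpResponseStatus="Found" />
        </system.webServer>
    </location>
</configuration>
```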


Example
In the following example, the "pages" directory contains page1.htm, page2.htm, page3.htm, and page4.htm. The web.config shown below will do the following

 

<?xml version="1.0"?>
<configuration>
    <location path="page1.htm">
        <system.webServer>
            <httpRedirect enabled="true" destination="http://www.victor-ratajczyk.com/examples/webconfig/redir/newpages/newpage.htm" httpResponseStatus="Permanent" />
        </system.webServer>
    </location>
    <location path="page2.htm">
        <system.webServer>
            <httpRedirect enabled="true" destination="http://www.google.com" httpResponseStatus="Permanent" />
        </system.webServer>
    </location>
    <location path="page3.htm">
        <system.webServer>
            <httpRedirect enabled="true" destination="http://news.yahoo.com/science/" httpResponseStatus="Permanent" />
        </system.webServer>
    </location>
    <location path="page5.htm">
        <system.webServer>
            <httpRedirect enabled="true" destination="http://www.google.com" httpResponseStatus="Permanent" />
        </system.webServer>
    </location>
</configuration>