Archive

Archive for the ‘General’ Category

New .co, .net.co, .com.co domains

July 29th, 2010

The .co domain has arrived!

Colossal new domain extension .co has just become available. The country code for Colombia, .co is also a great choice for company, commerce or community websites.

Demand for .co domains is sky high, but hopefully your preferred name is still available for registration. So don’t hang about, secure your .co – before someone else does.

Find your .co domain name now >

Categories: General

mod_rewrite: A Beginner’s Guide to URL Rewriting

July 24th, 2010

So you’re a Web developer who has all the bells and whistles on your site and creates Web-based applications that are both beautiful and work well. But what about these issues?

Applications Must Be Safe

A user must not be able to harm your site in any way by modifying a URL that points to your applications. To keep your site safe, check all the GET variables coming from your visitors (it goes without saying that POST variables must be examined too).

For example, imagine we have a simple script that shows all the products in a category. Generally, it’s called like this:

app.php?target=showproducts&categoryid=123

But what will this application do if ScriptKiddie(tm) comes and types this in his browser:

app.php?target=showproducts&categoryid=youarebeinghacked

Well, many of the sites I’ve seen will drop some error message complaining about a wrong SQL query, an invalid MySQL resource ID, and so on. These sites are not secure. And can anyone guarantee that a site-to-be-finished-yesterday will have all the parameter verifications in place, even in a programming team of only two or three people?

Applications Must Be Search-Engine Friendly

It’s not generally known, but many search engines will not index your site in depth if it contains links to dynamic pages like the one mentioned above. They simply take the “name” part of the URL (everything before the question mark; what follows it holds the parameters that most scripts need in order to run correctly), and then try to fetch the contents of the page. To make it clear, here are some links from our fictitious page:

app.php?target=showproducts&categoryid=123
app.php?target=showproducts&categoryid=124
app.php?target=showproducts&categoryid=125

Unfortunately, there’s a big chance that some of the search engines will try to download the following page:

app.php

In most cases calling a script like this causes an error – but if not, I’m sure it will not show the proper contents the link was pointing to. Just try this search at google.com:

“you have an error in your sql syntax” .php -forum

There are both huge bugs and security holes in the scripts listed. Again, these scripts are not search-engine friendly.

Applications Must Be User-Friendly

If your application uses links like:

http://www.down.com/?category=34769845698752354

then most of your visitors will find it difficult to get back to their favourite category (eg. Nettools/Messengers) every time they start from the main page of your site. Instead, they’d like to see URLs like this:

http://www.down.com/Nettools/Messengers

It’s even easier for the user to find (pick) the URL from the browsers’ drop-down list as they type into the Location field (though of course this only works if the user has visited that previously).

And what about you?

Now you have everything you need to answer the following questions:

  • Is your site really safe enough?
  • Can you protect your site from hackers?
  • Are your Websites search-engine compatible?
  • Are the URLs on your site ‘user friendly’ – are they easy to remember? …and would you like them to be?

What is the mod_rewrite Solution, Exactly?

But what exactly does it do? Here comes the whole point of this article!

mod_rewrite catches URLs that meet specific conditions, and rewrites them as it was told to.

For example, you can have a non-existent

http://www.mysite.co.uk/anything

URL that is rewritten to:

http://www.mysite.co.uk/deep/stuff/very_complicated_url?text=having_lots_of_extra_characters

Did you expect something more? Be patient…

<IfModule mod_rewrite.c>
RewriteEngine on
RewriteRule ^/shortcut$ /complicated/and/way/too/long/url/here
</IfModule>

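Of course this, too, should go into the httpd.conf file. The leading slash in the pattern assumes server-level configuration; in a per-directory .htaccess file the path is matched without it.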

After you restart Apache (you’ll get used to it soon!) you can type this into your browser:

http://mysite.co.uk/shortcut

If there’s a directory structure /complicated/and/way/too/long/url/here existing in your document root, you’re going to be “redirected” there, where you’ll see the contents of this directory (eg, the directory listing, index.html, whatever there is).

To understand mod_rewrite better, it’s important to know that this is not true redirection. “Classic” redirection is done with the Location: header of the HTTP protocol, and tells the browser itself to go to another URL. There are numerous ways to do this, for example, in PHP you could write:

<?php
// this PHP file is located at http://localhost/shortcut/index.php
// send the redirect header, then stop so nothing else is output
header("Location: /complicated/and/way/too/long/url/here");
exit;
?>

This code shows the same page by sending an HTTP header back to the browser. That header tells the browser to move to another URL straight away. What mod_rewrite does is totally different: it ‘tricks’ the browser, and serves the page as if it were really there. That’s why this is a URL rewriter and not a simple redirector (you can even inspect the HTTP headers sent and received to see the difference).
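To make the difference concrete, here is a minimal sketch of the same shortcut written both ways. It assumes the rules live in the server-level configuration (hence the leading slash) and reuses the example path from above. The two rules are alternatives, so the second one is left commented out.

```apache
<IfModule mod_rewrite.c>
RewriteEngine on

# Variant 1: internal rewrite. The browser still shows /shortcut
# and never learns about the long path.
RewriteRule ^/shortcut$ /complicated/and/way/too/long/url/here [L]

# Variant 2: external redirect. Apache answers with a Location: header
# (status 302 unless another code is given) and the browser then
# requests the new URL itself.
# RewriteRule ^/shortcut$ /complicated/and/way/too/long/url/here [R,L]
</IfModule>
```

With the first variant the address bar keeps showing /shortcut; with the second it changes to the long URL.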

But it’s not just shortening paths that makes mod_rewrite the “Swiss Army Knife of URL manipulation”…

Rules

You’ve just seen how to specify a really simple RewriteRule. Now let’s take a closer look…

RewriteRule Pattern Substitution [Flag(s)]

RewriteRule is a simple instruction that tells mod_rewrite what to do. The magic is that you can use regular expressions in the Pattern and references in the Substitution strings. What do you think of the following rule?

RewriteRule ^/products/([0-9]+)$ /siteengine/products.php?id=$1

Now you can use the following syntax in your URLs:

http://mysite.co.uk/products/123

After restarting Apache, you’ll find this is translated as:

http://mysite.co.uk/siteengine/products.php?id=123

If you use only ‘fancy’ URLs in your scripts, there will be no way for your visitor to find out where your script resides (/siteengine in the example), what its name is (products.php), or what the name of the parameter to pass (id) is! Do you like it? We’ve just completed two of our tasks, look!

  • Search-engine compatibility: there are no fancy characters in the URL, so the engines will explore your whole site
  • Security: ScriptKiddie(tm)-modified URLs will cause no error, as they’re verified with the regular expression first to be a number – URLs with no proper syntax can’t even reach the script itself.

Of course, you can create more complex RewriteRules. For example, here’s a set of rules:

RewriteRule ^/products$ /content.php
RewriteRule ^/products/([0-9]+)$ /content.php?id=$1
RewriteRule ^/products/([0-9]+),([ad]*),([0-9]{0,3}),([0-9]*),?$ /marso/content.php?id=$1&sort=$2&order=$3&start=$4

Thanks to these rules we can use the following links in the application:

  • Show an opening page that contains product categories: http://somesite.co.uk/products
  • Product listing, categoryid is 123, page 1 (as default), default order: http://somesite.co.uk/products/123 or http://somesite.co.uk/products/123,,,,
  • Product listing, categoryid is 123, page 2, descending order by third field (d for descending, 3 for third field): http://somesite.co.uk/products/123,d,3,2

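This is also an example of the use of multiple RewriteRules. The rules are tried in order. When a pattern matches, the substitution occurs and mod_rewrite carries on with the remaining rules, now testing the rewritten URL, unless the rule carries the ‘last’ (L) flag, which stops processing at that point. Here no later pattern matches the rewritten URL, so Apache serves the page with the substituted URL. Should there be no match at all (after processing all the rules), the URL is left untouched and, in our case, the usual 404 page comes up. And of course you can also define one or more rules (eg. ^.*$ as last pattern) to specify which script(s) to run depending on the mistaken URL.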
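As a sketch of that last idea, the product rules could be closed with a catch-all. The script name /error.php and its reason parameter are hypothetical, and the rules again assume server-level configuration. The [L] flag marks a rule as the last one to apply to a matching request, which makes the "first match wins" behaviour explicit.

```apache
RewriteRule ^/products$ /content.php [L]
RewriteRule ^/products/([0-9]+)$ /content.php?id=$1 [L]

# Anything else under /products/ did not pass the patterns above,
# so hand it to a (hypothetical) error script instead of the application.
RewriteRule ^/products/ /error.php?reason=badurl [L]
```

A request for /products/youarebeinghacked would end up at the error script, while /products/123 never reaches the third rule.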

The third, optional part of RewriteRule is:

RewriteRule Pattern Substitution [Flag(s)]

With flags, you can send specific headers to the browser when the URL matches the pattern, such as:

  • ‘forbidden’ or ‘F’ for 403 Forbidden,
  • ‘gone’ or ‘G’ for 410 Gone,
  • you may also force redirection, or force a MIME-type.

You can even use the:

  • ‘nocase’ or ‘NC’ flag to make the pattern case-insensitive
  • ‘next’ or ‘N’ flag to loop back to the first rule (‘next round’; be careful with it, as this may result in an endless loop!)
  • ‘skip=N’ or ‘S=N’ flag to skip the following N rules

…and so on.
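A few of these flags in use might look like the sketch below. The paths /private/ and /old-catalogue.html are made up for illustration, and the rules assume server-level configuration. A substitution of "-" tells mod_rewrite to leave the URL as it is and only apply the flags.

```apache
# 403 Forbidden for everything under a (hypothetical) /private/ path.
RewriteRule ^/private/ - [F]

# 410 Gone for a page that has been retired for good.
RewriteRule ^/old-catalogue\.html$ - [G]

# Case-insensitive match: /Products/123 and /products/123 both work.
RewriteRule ^/products/([0-9]+)$ /siteengine/products.php?id=$1 [NC,L]

# Force a MIME type: serve Perl scripts as plain text.
RewriteRule \.pl$ - [T=text/plain]
```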

We hope you feel like we’ve felt while playing around with this module for the first time!

Conditions

But that’s not all! Though RewriteRule gives you an opportunity to have professional URL rewriting, you can make it more customized using conditions.

The format of the conditions is simple:

RewriteCond Something_to_test Condition

Any RewriteCond condition affects the behaviour of the following RewriteRule, which is a little confusing, as RewriteCond won’t be evaluated until the following RewriteRule pattern matches the current URL.

It works like this: mod_rewrite takes all the RewriteRules and starts matching the current URL against each RewriteRule pattern. If a pattern matches the URL, mod_rewrite evaluates the conditions attached to that RewriteRule in the order they are written. By default every condition must be true (they are ANDed; the OR flag joins a condition to the next one with ‘or’ instead). If the conditions hold, the substitution occurs; if not, the rule is skipped and the subsequent RewriteRule is checked.

This way you can customize URL rewriting using conditions based on practically everything that’s known during an HTTP transfer in Apache, and a lot more! Basically you can use all of these variables in the Something_to_test string:

  • HTTP header variables: HTTP_USER_AGENT, HTTP_REFERER, HTTP_COOKIE, HTTP_FORWARDED, HTTP_HOST, HTTP_PROXY_CONNECTION, HTTP_ACCEPT
  • Connection & request variables: REMOTE_ADDR, REMOTE_HOST, REMOTE_USER, REMOTE_IDENT, REQUEST_METHOD, SCRIPT_FILENAME, PATH_INFO, QUERY_STRING, AUTH_TYPE
  • Server internal variables: DOCUMENT_ROOT, SERVER_ADMIN, SERVER_NAME, SERVER_ADDR, SERVER_PORT, SERVER_PROTOCOL, SERVER_SOFTWARE
  • System variables: TIME_YEAR, TIME_MON, TIME_DAY, TIME_HOUR, TIME_MIN, TIME_SEC, TIME_WDAY, TIME
  • mod_rewrite special values: API_VERSION, THE_REQUEST, REQUEST_URI, REQUEST_FILENAME, IS_SUBREQ

The condition can be a simple string or a standard regular expression, with additions like:

  • <, >, = simple comparison operators
  • -f if Something_to_test is a file
  • -d if Something_to_test is a directory

As you can see, these are more than enough to specify a condition like this one (taken from the mod_rewrite manual):

RewriteCond %{HTTP_USER_AGENT} ^Mozilla.*
RewriteRule ^/$ /homepage.max.html [L]
RewriteCond %{HTTP_USER_AGENT} ^Lynx.*
RewriteRule ^/$ /homepage.min.html [L]
RewriteRule ^/$ /homepage.std.html [L]

When a browser requests the index page, 3 things can happen:

  • a browser whose User-Agent string begins with ‘Mozilla’ will be served homepage.max.html
  • Lynx (a character-based browser) will be served homepage.min.html
  • if the User-Agent string begins with neither ‘Mozilla’ nor ‘Lynx’, the standard homepage.std.html file will be sent
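The -f and -d tests from the list above can be combined in the same way. The following is only a sketch: /app.php and its path parameter are hypothetical. In server-level configuration the request has not yet been mapped to the filesystem when the rules run, so the document root is prefixed by hand here (in a .htaccess file %{REQUEST_FILENAME} is the usual choice).

```apache
# Send the request to a (hypothetical) front controller only when the
# requested path is neither an existing file nor an existing directory.
RewriteCond %{DOCUMENT_ROOT}%{REQUEST_URI} !-f
RewriteCond %{DOCUMENT_ROOT}%{REQUEST_URI} !-d
RewriteRule ^/(.*)$ /app.php?path=$1 [L,QSA]
```

The QSA flag appends any query string from the original request to the rewritten one instead of discarding it.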

You can even stop other sites from embedding images straight from your server. Conditions are ANDed by default, so the rule only fires when the referer is non-empty and matches none of your own hosts:

RewriteCond %{HTTP_REFERER} !^$
RewriteCond %{HTTP_REFERER} !^http://localhost/ [NC]
RewriteCond %{HTTP_REFERER} !^http://(www\.)?mysite\.co\.uk/ [NC]
RewriteCond %{REQUEST_URI} !/images/bad\.gif$
RewriteRule \.(gif|jpg)$ http://mysite.co.uk/images/bad.gif [NC,L,R]

But of course, there are endless possibilities, including IP- or time-dependent conditions, etc.
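As a rough sketch of both ideas: the address 192.0.2.10 and the foo.*.html file names are placeholders, and the time comparison works on the four-digit string built from the hour and minute, so it is a plain string comparison rather than real clock arithmetic.

```apache
# Refuse one (hypothetical) client address with 403 Forbidden.
RewriteCond %{REMOTE_ADDR} ^192\.0\.2\.10$
RewriteRule ^ - [F]

# Between 07:00 and 19:00 serve the daytime page, otherwise the night one.
RewriteCond %{TIME_HOUR}%{TIME_MIN} >0700
RewriteCond %{TIME_HOUR}%{TIME_MIN} <1900
RewriteRule ^/foo\.html$ /foo.day.html [L]
RewriteRule ^/foo\.html$ /foo.night.html [L]
```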

For Advanced Users

We mentioned user-friendliness in the introduction, and haven’t dealt with it yet. First, let’s imagine we have a huge download site that has the downloadable software separated into categories, each with a unique id (which is used in the SQL SELECTs). We could use links like open.php?categoryid=23487678 to display the contents of a category.

To ensure that our URLs are easily memorized (eg. http://www.downloadsite.com/NetTools/Messengers) we could use:

RewriteRule ^/NetTools$ /open.php?categoryid=3
RewriteRule ^/NetTools/Messengers$ /open.php?categoryid=34

assuming the ID is 3 for the NetTools category and 34 for Messengers subcategory.

But our site is huge, as we’ve mentioned. Who wants to hunt down all the IDs from the database, and then edit the config file by hand? No-one! Instead, we can use the mapping feature of mod_rewrite. RewriteMap allows us to provide a replacement table stored in a plain text file, in a hash file (for fast lookups), or even served by an external program!

For better performance I’d generate a single text file using PHP, which contains the following:

NetTools            3
NetTools/Messengers 34
.
.
.
and so on.

The httpd.conf file would contain (RewriteMap is not allowed in .htaccess files):

RewriteMap categories txt:/path/to/file/categoryids.txt
RewriteRule ^/(.*)$ /open.php?categoryid=${categories:$1|0}

These lines tell mod_rewrite to look up the requested path in the categoryids.txt file (lookups are cached until the file changes) and pass the matching ID to open.php. The |0 means that categoryid will be 0 if there’s no matching key in the textfile.

You can also choose to serve the IDs on-the-fly via a script or other executable code. The program is started by Apache on server startup, and runs until shutdown. The program must have buffered I/O disabled, read from the stdin, and write results to stdout — it’s that simple!
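Only the map type changes in the configuration. In this sketch the program path is hypothetical; the program itself could be written in any language, as long as it answers each line it reads with exactly one line of output.

```apache
# The program is started once with the server. It receives one lookup key
# per line on stdin and must answer each with one line on stdout
# (the string NULL when there is no match, so the |0 default applies).
RewriteMap categories prg:/usr/local/bin/category-lookup
RewriteRule ^/(.*)$ /open.php?categoryid=${categories:$1|0} [L]
```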

With RewriteMap you can do a lot more, including:

  • load balancing through servers (using rnd:),
  • creation of a Webcluster that has a homogeneous URL layout,
  • redirection to mirror sites without modifying your Web application,
  • denial of user access based on a hostlist,

and so on.
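The first item, for instance, could be sketched like this. The mirror host names and the map file name are assumptions made for the example; the rule sends each download request to one of the listed mirrors with an external redirect.

```apache
# Contents of the (hypothetical) map file /path/to/file/mirrors.txt:
#   downloads  mirror1.mysite.co.uk|mirror2.mysite.co.uk
#
# rnd: picks one of the |-separated values at random on each lookup.
RewriteMap mirrors rnd:/path/to/file/mirrors.txt
RewriteRule ^/downloads/(.*)$ http://${mirrors:downloads}/downloads/$1 [R,L]
```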

Tips, Tricks and Advice

  1. Before using mod_rewrite on a production server, I’d recommend setting up a test server (or playground, whatever you prefer to call it).
  2. During development, you must avoid using ‘old-fashioned’ URLs in your application.
  3. There might still be a need to verify data passed through the URL (passing non-existent IDs, too large or too small, for example, might be risky).
  4. Writing ‘intelligent’ RewriteRules saved me coding time and helped me write simpler code. I use error_reporting(E_ALL); everywhere (and we recommend it!), but I find it boring to do the following for the ten thousandth time:

    if (isset($_GET['id']) && validNumber($_GET['id']))
    if (isset($_GET['todo']) && ($_GET['todo']=='deleteitem'))

    The following trick helped to get rid of the extra isset() expression by providing all the needed parameters each time in the RewriteRules:

    RewriteRule ^/products/([0-9]+)$ /products.php?id=$1&todo=

    I know, I know, it’s not the answer to the meaning of life, but it’s hard to show how nice and clear a solution this might provide in such a short example.

Finally…

That’s all for our ‘brief’ overview of mod_rewrite. After you’ve mastered the basics, you’ll find you can easily create your own rules. If you like the idea of URL rewriting, you may want to play with mod_rewrite; some ideas follow (note that the underlying PHP code is not important in this case):

http://www.mysite.co.uk/1/2/3/content.html
=> 1_2_3_content.html

http://www.mysite.co.uk/1/2/3/content.html
=> content.php?category=1

http://www.mysite.co.uk/1/2/3/
=> content.php?category=1&subcat1=2&subcat2=3

http://www.mysite.co.uk/1/2/3/details
=> content.php?category=1&subcat1=2&subcat2=3

http://www.mysite.co.uk/bookshop/browse/bytitle
=> library.php?target=listbooks&order=title

http://www.mysite.co.uk/bookshop/browse/byauthor
=> library.php?target=listbooks&order=author

http://www.mysite.co.uk/bookshop/product/123
=> library.php?target=showproduct&itemid=123

http://www.mysite.co.uk/bookshop/helpdesk/2
=> library.php?target=showhelp&page=2

http://www.mysite.co.uk/bookshop/registration
=> library.php?target=reg
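One possible answer for the bookshop URLs is sketched below; it is not the only way to write them. It assumes library.php sits in the document root and that the rules go into the server-level configuration.

```apache
# bytitle / byauthor: capture the word after "by" and pass it as the order.
RewriteRule ^/bookshop/browse/by(title|author)$ /library.php?target=listbooks&order=$1 [L]

# Numeric item and help page IDs only; anything else is left alone.
RewriteRule ^/bookshop/product/([0-9]+)$ /library.php?target=showproduct&itemid=$1 [L]
RewriteRule ^/bookshop/helpdesk/([0-9]+)$ /library.php?target=showhelp&page=$1 [L]

RewriteRule ^/bookshop/registration$ /library.php?target=reg [L]
```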


How to find search engine love – hot tips to improve your ranking

June 26th, 2010

Every website needs to woo search engines like Google and Yahoo. But forget boxes of chocolates and expensive dinner dates: search engines are only interested in… optimisation.

Search Engine Optimisation (SEO) is the art of courting search engines. Yes, they can be the most elusive of prospects. But don’t get yourself into a flush; there are plenty of ways to catch a search engine’s eye.

So here’s our quick and easy guide to becoming a true search engine aficionado.

Speak the language of love

How do you charm a search engine? Well, you can’t just whisper sweet little nothings in its ear – that won’t work. Instead, search engines coo to the tones of something called metadata, and thankfully this silver-tongued art is relatively easy to master.

Metadata appears at the top of an HTML page (in between the <head> & </head> tags), and it’s here that the keywords, which can melt search engine hearts, reside. Put simply, keywords should be relevant to your site’s name, theme and purpose. But, as so often in affairs of the heart, things can get complicated. So here are some handy SEO tips for keywords:

1)    Firstly, establish which words people use when searching for your type of site.
2)    Try to use keywords your competitors haven’t used.
3)    Think about whether your site has a unique niche which it can exploit, such as a service or location.
4)    Remember, people aren’t machines – they type all sorts of phrases when searching online, so it pays to think creatively.
5)    Also, use keywords for your link text – never use ‘click here’.

Create the right impression on dates

Search engines rank sites by periodically assessing their content. This is kind of like an infinitely recurring first date, so your site needs to spruce up for every fresh encounter. Search engines hate dates when the conversation dries up, or becomes repetitive. That’s why you should never duplicate your site content from another site. Always create original material and endeavor to create fresh content as often as possible (at least once a month).

It can be tacky, but let’s face it – sometimes you need to splash the cash, and search engines are very easily impressed by a big spender who picks up the bill on a date. So make sure you buy all the available Top Level Domain (TLD) extensions (e.g. .co.uk, .com, .info) for your brand name. But just for good measure, here are a few more FREE ways to flirt with search engines:

1)    Use unique and relevant titles on every page of your site.
2)    Validate your sites, so spiders can crawl through faster.
3)    Keep your pages under 1 kilobyte.

Hopefully this post will help you along the often rocky road to search engine romance. Please let us know if you’ve got any SEO tips you’d like to share with your fellow readers.


Meet the Web Server

June 26th, 2010

There is a lot that goes into running a web hosting business.  The provider needs an internet connection, bandwidth and data facility to store the equipment that enables the service.  While numerous components are required, almost all of them revolve around the web server.

What is a Web Server?

The term web server actually describes two different elements. One is the computer that stores the data for websites. The other is a software application that runs on the computer and processes requests from web browsers and other client-side technologies. Though the names are often used interchangeably, these two components are quite different. For this reason, one should always clarify any mention of a web server, as it can refer to either a machine or an application.

The Web Server in Action

A web server application helps the actual hardware serve web pages upon the request of a browser such as Internet Explorer or Opera. Because it deals primarily in HTTP (Hypertext Transfer Protocol) requests, this type of application is often referred to as an HTTP server. HTTP is a protocol for transferring data over the internet that enables two computers to communicate with each other. When you use your web browser to access any given website, a request is transmitted to a web server on a remote computer. The server application then processes the browser’s request and attempts to locate the requested web page. If it is found, the server sends the page to your browser, which then displays the appropriate content.

Commonly Used Web Servers

The Netcraft Web Server Usage Survey reports that the Apache HTTP server is the most widely installed web server in the world, claiming that it has nearly 60% of the market share. As an open-source application, Apache supports numerous open-source technologies such as the Linux operating system and MySQL database server.

Using a Web Server off the Web

While primarily intended for the web hosting arena, web server applications can be used for other purposes as well. For example, many techies have the Apache server installed on their Windows-based computers. This is great for someone who scripts custom programs for their own servers. Many developers find this method much easier than working on a remote server. So, if you have a powerful computer and a need to create PHP scripts, a web server like Apache could work wonders on your system.

Categories: General

Avoiding Common Web Hosting Traps

June 26th, 2010

Web hosting providers come a dime a dozen but landing a good one isn’t always easy.  There are many choices out there and some overwhelm you with so much glamor that it becomes pretty easy to get dazzled by their marketing techniques.  This article will discuss several crucial web hosting traps that you need to look out for.

Amazingly Low Price

It’s true – web hosting is very affordable, so much so that “cheap hosting” has become one of the most highly sought after offerings on the market. Unfortunately, some providers have to make substantial sacrifices in order to drop the price of their service. Whether it’s overloading the server with customers or cutting back on support, it all affects the overall quality of service and could leave you with major issues. Not all cheap web hosting packages are of poor quality, but because several are, you need to be very wary of an amazingly low price.

Limitations

Whether it’s for personal or business matters, your website is almost certain to expand. Over time, you will need to upload more files, possibly add new software and hopefully receive more traffic. Don’t go after the first web hosting deal you run across or one that only offers enough to support you for the first couple of months. Instead, check out a variety of hosts and focus on those with features that allow your site to grow. Sometimes, a provider’s one-size-fits-all hosting package isn’t the solution you need.

Unprofessional Site

A web host that comes with all the bells and whistles may warrant the red flag, but so should a company with an underdeveloped website. It doesn’t have to be glamorous, but a respectable web hosting provider must look the part. They should have a professional appearance with a site that is simple to navigate, making all the essentials easily accessible. You should have no trouble locating the FAQ page and especially an email address or phone number.

Incredible Claims

Most web hosting companies advertise a 99.9% uptime guarantee, which is meant to ensure that your website is up at least 99.9% of the time over a given month (in a 30-day month, that allows roughly 43 minutes of downtime). Sadly, many hosts are just advertising this percentage instead of upholding it. A host that really lives up to its uptime guarantee will offer money back, credit or similar compensation if it is not able to deliver as promised. This is actually a rarity, and if you don’t know you’re supposed to be compensated, your site will just be down until the issues are worked out behind the scenes. This could be hours or days.

Questionable Business Practices

You should strongly consider avoiding any provider with complex pricing schemes and other questionable methods.   For example, the price might be advertised as £0.99 per month but you may have to lock into a three-year contract in order to get that price.  Although this is a common practice, you need to make yourself aware of situations where promises turn out to be tricks. At Laws Hosting, you receive what you see on our website – no gimmicks.

Non-existent Support

A web hosting provider’s approach towards support can tell you a lot about their approach to the overall service.   When you call or email with questions, they should be more than happy to help, or at least act like it.  Don’t just fall for their 24/7 support claim.  Put them to the test to get an idea of how they will react when you really need help.

Categories: General, Tips & Tutorials