What languages are you speaking?

A search engine has arrived at your website. At some point during its indexing process, it will want to work out your geographical location, even if only your country. Search engines are country-specific, and the results from google.co.uk, google.co.nz, google.fr and google.com will usually be different for the same search. Part of this is down to geo-location - the next topic in this guide - and part is down to the language or languages you offer on your website.

Incidentally, as a visitor, it’s a good idea to use your country’s version of a search engine to get country-specific results. If you use a generic (.com) version, it may attempt to work out where you are from where you are currently logged on. So if you are in foreign parts and therefore attached to a foreign ISP, you may get unexpected results.

Each web page should include a directive to indicate the natural language of the intended audience

This directive is hidden from the visitor: in web jargon, it is metadata. How you set it will vary depending on whether you are writing code directly or using a content management system (eg WordPress, Joomla, Umbraco, Orchard). Whichever route you take, in the end the <html> tag at the top of each page needs a lang attribute to show the language of the intended audience. For example:

<html lang="da-DK">

The language is indicated with a two-letter code. In the above example da corresponds to Danish. The hyphen and the two-letter code following it are optional and indicate that this is the variation of Danish spoken in Denmark. The latter part is called the region or locale.

To find out what language codes are already in your page – and of course there may be none – open the page in a browser, right-click (possibly ctrl-click on a Mac) and then left-click on View Source. The <html> tag should be at the top of the page. This is what you will see if the page is in British (aka real) English:
<html lang="en-gb">

In the absence of a language directive, the search engine will guess your location from things like your hosting service’s address – UK, Ireland, USA, Singapore, wherever. If this ‘wherever’ is somewhere other than your own country, or even a region within your country, you could end up in the wrong country index or in the wrong place in the index.

Use a two-letter code if your market is worldwide, include the locale if it’s national

Remember that the pond gets a whole lot bigger for a worldwide market and your fish size has remained the same. You will not be excluded from the worldwide market by electing to include the locale, but you are likely to have a more visible presence in your smaller national pond. There is a principle that language tags should be kept as short as possible, but the locale is ‘distinguishing information’ if you are operating primarily in your own country and that country isn’t the USA, where global endings (.com, .org) are the norm.

Should your page contain more than one language – something that applies to home pages that greet visitors in several languages before pointing them to their language-specific pages – note that the lang attribute takes a single language tag, not a comma-separated list. A list such as en-gb,cy is only valid in the Content-Language HTTP header. On the <html> tag, give the main language of the page:

<html lang="en-gb">

The above would be appropriate for an Anglo-Welsh page whose main language is English. The sections that are in English and the sections that are in Welsh should then be individually marked in the body of the page. This is done by adding a lang attribute to the tag that encloses each section. In very brief outline, using a <div> element for each section:

<html lang="en-gb">
<head>....</head>
<body>
<div lang="en"> ...English paragraphs here... </div>
<div lang="cy"> ...paragraffau yma Cymraeg... </div>
</body>
</html>

The two-letter language codes are defined by ISO 639 (the region codes come from ISO 3166), and it’s only when you wade through it that you realise how many languages there are in the world.
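The lang attributes described above can also be listed with a short script rather than by reading View Source by hand. What follows is a minimal sketch in Python using only the standard library. The names LangCollector, split_tag and SAMPLE_PAGE are illustrative assumptions, not part of any tool mentioned in this guide, and the splitting assumes simple tags of the form language or language-region.

```python
from html.parser import HTMLParser


class LangCollector(HTMLParser):
    """Record every lang attribute found, in document order."""

    def __init__(self):
        super().__init__()
        self.found = []  # list of (element name, lang value) pairs

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name == "lang" and value:
                # Language tags are case-insensitive, so normalise to lower case.
                self.found.append((tag, value.lower()))


def split_tag(lang):
    """Split 'en-gb' into ('en', 'gb'); the region is None if absent.

    Assumption: only simple language or language-region tags are handled.
    """
    language, _, region = lang.partition("-")
    return language, region or None


# Hypothetical sample page, modelled on the Anglo-Welsh outline.
SAMPLE_PAGE = """
<html lang="en-GB">
<head><title>Sample</title></head>
<body>
<div lang="en">English paragraphs here</div>
<div lang="cy">Welsh paragraphs here</div>
</body>
</html>
"""

collector = LangCollector()
collector.feed(SAMPLE_PAGE)
for element, lang in collector.found:
    print(element, split_tag(lang))
```

For a real page, the source saved from the browser could be read into a string and passed to feed() in place of SAMPLE_PAGE. An empty result would suggest the page declares no language at all.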
Here are just a few of the codes:

en - generic English, giving nothing away in terms of language variant
en-GB - British English: this indicates that your market is UK citizens expecting prices in pounds sterling and dates in day/month/year format
cy - any Welsh speaker
fr - any French speaker
fr-FR - French as spoken in France
fr-CA - French as spoken in Canada
fr-GB - French as spoken in Britain

This last one might seem a bit weird, but if someone were targeting the 400,000 French living in London who expect to buy things in pounds sterling, this is different from selling stuff to the French in France or Francophone Canada, where the prices are in euros and Canadian dollars respectively – as well as there being different vernacular.

Although the locale is shown in upper case above, search engines and browsers will ignore capitalisation, so if you don’t like the shouty effect of capitals, use lower case.

Don’t be tempted to fill the language directive with every language code under the sun

This does not make things easier for the search engines, and they may resort to other, sub-optimal, methods of working out what languages are on the page.

Some pages in your website may need different language directives from the site as a whole. If you are offering your website in different languages or varieties of language, do change the language code as appropriate. Even English websites might have UK and US versions of pages to account for different vernacular – the British take ‘holidays’ while the Americans take ‘vacations’; a courgette becomes zucchini in American; a wedding ring is a wedding band; a merry-go-round is a carousel; car bonnets and boots become hoods and trunks; and so forth.

Using UTF-8 character encoding simplifies the use of accents

There is no business case for changing the character encoding of your website unless some people are seeing little square boxes where characters should be or funny renditions of certain characters, eg Â£ instead of plain £.
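Garbling of this kind typically arises when bytes written in one encoding are read back as another. The following is a minimal sketch in Python, assuming Windows-1252 (Python’s cp1252 codec, often labelled ANSI in Windows editors) as the single-byte encoding involved. The variable names are illustrative only.

```python
POUND = "\u00a3"  # the pound sign

# Case 1: text saved as UTF-8 but read as Windows-1252.
# UTF-8 stores the pound sign as two bytes; each byte is shown as its own character.
utf8_bytes = POUND.encode("utf-8")        # b'\xc2\xa3'
garbled = utf8_bytes.decode("cp1252")     # two characters instead of one

# Case 2: text saved as Windows-1252 but read as UTF-8.
# The single byte is not valid UTF-8, so it becomes the replacement
# character U+FFFD, which browsers often draw as a box or a question mark.
ansi_bytes = POUND.encode("cp1252")       # b'\xa3'
replaced = ansi_bytes.decode("utf-8", errors="replace")

print(garbled)
print(replaced)
```

The first line printed is the pound sign with a stray accented capital A in front of it, and the second is the replacement character. It is worth checking whether any of your pages display text like this.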
If so, then you have a UXO issue. The most common cause we have encountered of unexpected characters on a website is that the physical file is saved as ANSI while the web page encoding is declared as utf-8. The solution is to convert the file from ANSI to utf-8 encoding using something like Notepad++.

In the event of a problem, there are three places where you may need to sort things out:

the web server – you may or may not be able to change this;
the web page – you should be able to set this;
the web page file – again, you should be able to change this.

Sorting out encoding will almost certainly require expert help because of things that can unexpectedly affect it. For example:

the editor you are using to create the pages might automatically save files with a particular encoding;
there can be differences between PCs and Macs in the file’s BOM (Byte Order Mark), but your server may require no BOM at all.

To see what encoding you are using, or if one is specified at all, open a page in your website and go to View Source. In the <head> section there may be a line:

<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">

If this line exists at all, it might have some value other than iso-8859-1. In newer pages the same declaration may appear in the shorter form <meta charset="utf-8">. If the value is utf-8 then all should be well and the problem will lie with the file encoding or the web server.

If you do decide to have a go at fixing the problem yourself, first make a backup of your website so that you can revert to a perhaps not brilliant, but at least no worse, position.

Incidentally, the character set declaration should be placed immediately after the <head> tag because once a search engine or browser discovers this line, it may have to restart reading the page with the encoding that it has just found, throwing away anything it has read so far.

2023 © Caz Limited