DeepCrawl Update – Version 1.6

Christmas has come early this year with the release of a new version of DeepCrawl, which includes a range of fixes and features.

Redirect Chains & Max Redirects

We now display chains of multiple redirects and report when a chain contains more than 4 redirects which might not be followed by search engines.

In your existing 301 and 302 Redirects reports you will now see the first 4 URLs in a redirection chain. If any URL in the redirect chain is a 302 then the redirect will be categorised as a 302.

In the Validation Tab you will see a new report called Max Redirections which contains all the redirect chains with more than 4 redirects.

The number of Max Redirects allowed can be customised in the Report Settings for each project.
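The rules described above can be sketched in a few lines of Python. This is only an illustration and not DeepCrawl's actual implementation: the `responses` mapping, the function names and the result keys are all hypothetical, and the default limit of 4 simply mirrors the figure quoted in this post.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical crawl data: URL -> (HTTP status, redirect target or None).
Responses = Dict[str, Tuple[int, Optional[str]]]

REDIRECT_CODES = (301, 302)


def redirect_chain(start: str, responses: Responses) -> List[Tuple[str, int]]:
    """Return (url, status) for every redirecting hop reachable from start."""
    chain: List[Tuple[str, int]] = []
    seen = set()
    url: Optional[str] = start
    while url in responses and url not in seen:
        status, target = responses[url]
        if status not in REDIRECT_CODES or target is None:
            break  # reached a URL that does not redirect
        chain.append((url, status))
        seen.add(url)  # guard against redirect loops
        url = target
    return chain


def summarise(chain: List[Tuple[str, int]], max_redirects: int = 4) -> dict:
    """Apply the reporting rules: any 302 makes the whole chain a 302."""
    return {
        "type": 302 if any(status == 302 for _, status in chain) else 301,
        "redirects": len(chain),
        "first_urls": [url for url, _ in chain[:4]],  # only the first 4 are shown
        "exceeds_max": len(chain) > max_redirects,
    }
```

Calling `summarise(redirect_chain("/a", responses))` on a chain that contains a single 302 anywhere reports the chain as a 302, and `exceeds_max` marks the chains that would land in the Max Redirections report.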

 

Issues List

The Issues List allows you to create lists of issues whilst you are reviewing your reports. When you have finished, you can see a full list of all issues, which can be exported or shared.

In any report, click the Add Issue tab to bring up a pop-up form where you can create a new issue.

The form will save the report URL including any filters you have applied.

Once you have created an issue for a project, you can view it, along with all the other issues, using the ‘All Issues’ tab which appears.

You can then re-test each issue, which will take you to the latest report with your filters applied so you can see if it has been resolved. You can export and share the list of issues with clients or developers.

 

Support Form

We’ve added a Support Form if you have any questions or want to let us know about a problem. You can access the form via a Support Tab which is now visible on every screen.

This is the best way for you to get our attention quickly.

 

Recrawl project Link

You can now re-crawl any project with a single click from the Projects screen. The crawl will use the same settings as the previous crawl for that project.

 

Max Limit of 1000 Unique Links

To make the crawling process faster and more efficient, we now only track the first 1000 instances of a unique link. We define a unique link as having a unique target URL and anchor text. This particularly affects sitewide links such as navigation, which was causing problems when some sites had over 1000 links in their main navigation.

In the Links In reports, you will only see the first 1000 instances of a unique link.
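As a rough sketch of how such a cap could work (this is an assumption for illustration, not DeepCrawl's code; the `LinkTracker` class and its method names are hypothetical), a unique link is keyed on the target URL and anchor text, and further instances are ignored once the limit is reached:

```python
from collections import defaultdict
from typing import DefaultDict, List, Tuple

MAX_INSTANCES = 1000  # limit quoted in this post


class LinkTracker:
    """Keep only the first N source pages for each (target URL, anchor text)."""

    def __init__(self, max_instances: int = MAX_INSTANCES) -> None:
        self.max_instances = max_instances
        self.instances: DefaultDict[Tuple[str, str], List[str]] = defaultdict(list)

    def add(self, source_url: str, target_url: str, anchor_text: str) -> bool:
        """Record a link; return False if the cap for this unique link is hit."""
        found = self.instances[(target_url, anchor_text)]
        if len(found) >= self.max_instances:
            return False  # already tracking the first N instances
        found.append(source_url)
        return True
```

The same target URL with different anchor text counts as a separate unique link, so each combination gets its own allowance.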

 

New Navigation Headings

The main navigation has changed slightly so it makes more sense for new users.

  • Scheduled Reports shows you the latest reports for projects with scheduled crawls configured.
  • One-off Reports shows you a list of the reports you’ve run for projects without scheduled crawls configured
  • All Projects shows you a list of every project in your account
  • Active Crawls shows you a list of running and paused crawls

 

That’s it for this release. We’d appreciate any feedback you have on these changes, so please let us know what you’d like to see in future releases.

 

Posted on December 20, 2012 at 11:48 am by chris
In: DeepCrawl

hreflang – Language and Region specific Search Results

Does the physical location matter at all?

With the introduction of the hreflang attribute we decided to check how it works in practice for different language and location settings. We tested the three following factors for two different languages (en/pl) and two locations (GB/PL):

  1. Physical location of the searcher (UK vs PL)
  2. Country specific Google version (google.co.uk vs google.pl)
  3. Language Settings (hl=en vs hl=pl)

Which gave us eight combinations (four for each physical location):

  1. Google PL / Language PL
  2. Google PL / Language EN
  3. Google UK / Language PL
  4. Google UK / Language EN
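For anyone wanting to repeat the test, the full matrix can be generated rather than written out by hand. The snippet below is just a convenience sketch: the variable and function names are ours, and the URL pattern follows the `#hl=` form used in the examples further down.

```python
from itertools import product

LOCATIONS = ["PL", "GB"]  # physical location of the searcher
GOOGLE_VERSIONS = {"Google PL": "www.google.pl", "Google UK": "www.google.co.uk"}
LANGUAGES = ["pl", "en"]  # value of the hl parameter


def test_matrix() -> list:
    """Build every location / Google version / language combination."""
    rows = []
    for location, (label, domain), lang in product(
        LOCATIONS, GOOGLE_VERSIONS.items(), LANGUAGES
    ):
        rows.append(
            {
                "location": location,
                "google": label,
                "language": lang.upper(),
                "url": f"http://{domain}/#hl={lang}",
            }
        )
    return rows
```

Two locations, two Google versions and two languages give the eight combinations, four for each physical location.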

And here are the results:

Google PL / Language PL

http://www.google.pl/#hl=pl

As expected, a search in Poland using google.pl with the browser language set to Polish returns the Polish version of Wikipedia. There is very little difference between a search performed in Poland and one performed in the United Kingdom though.

Physical geolocation PL Physical geolocation GB

Google PL / Language EN

http://www.google.pl/#hl=en

The following search was completed in Poland using google.pl but with the browser language set to English, and the results are completely different. There is a triple listing combining the international, English and Polish versions of Wikipedia.

Interestingly, the searches performed in Poland and in the United Kingdom are identical. The physical location of the searching user has not had any impact on the results.

Physical geolocation PL Physical geolocation GB

Google UK / Language PL

http://www.google.co.uk/#hl=pl

Regardless of the physical location of the searcher and despite using google.co.uk, the Polish version of Wikipedia gets a higher ranking than the international and English versions.

Physical geolocation PL Physical geolocation GB

Google UK / Language EN

https://www.google.co.uk/search?hl=en

Again, the physical location of the searcher didn’t have any impact on the search results.

Physical geolocation PL Physical geolocation GB

Conclusions

There are two main conclusions:

  • The physical location where a search takes place has very little if any impact on the search results.
  • Google seems to be prioritising the language of the browser ahead of the country version of Google.

We more than welcome your thoughts and experiences.

Posted on October 21, 2012 at 10:18 pm by mm
In: Google, International SEO, Natural search, Site architecture

Google displaying message instead of description for robots.txt excluded files

We spotted today that Google has started displaying a message instead of an empty description. This happens when you disallow a page in robots.txt, preventing Google from indexing the content of the page and therefore the description as well. The message looks like this:

Google displaying message instead of description for robots.txt disallowed

The “learn more” link points to:
https://support.google.com/webmasters/bin/answer.py?hl=en&answer=156449

 

It is worth noting that Google displays the title if it knows about it; otherwise the URL is displayed instead.
URL as meta title
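If you want to check which of your URLs are disallowed and could therefore show this message, Python's standard library includes a robots.txt parser. The sketch below uses a made-up robots.txt and example URLs; note that `urllib.robotparser` applies its own matching rules, which may not match Googlebot's behaviour in every case.

```python
from urllib.robotparser import RobotFileParser

# Made-up robots.txt used purely for illustration.
EXAMPLE_ROBOTS = """\
User-agent: *
Disallow: /private/
"""


def is_blocked(robots_txt: str, url: str, user_agent: str = "Googlebot") -> bool:
    """Return True if robots_txt disallows user_agent from fetching url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, url)
```

For example, `is_blocked(EXAMPLE_ROBOTS, "http://www.example.com/private/page.html")` reports the page as blocked, while a URL outside `/private/` is not.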

 

 

Posted on October 19, 2012 at 1:44 pm by mm
In: Google, Natural search, Robots.txt, Site architecture

The Best BrightonSEO Yet?

Brighton Pier

Fish’n'Chips the night before Brighton SEO

There was never any doubt that BrightonSEO was once again going to justify itself as the ‘must attend’ UK search conference last Friday (September 14th, 2012). The conference itself has grown massively in the last few years and the queue served as a visual testament as 100s of excited ‘spammers’ gently weaved their way from the Royal Pavilion Gardens towards the Brighton Dome, which was again hosting the main event. Such growth is thanks in no small part to the continued dedication, commitment and unbelievable energy levels displayed by BrightonSEO’s (self-proclaimed) chief cat herder/organiser/host/showman Kelvin Newman.

BrightonSEO

My Brighton SEO started with a night of fish’n’chips with mushy peas (and beer) on the pier in the company of Richard Baxter, Lynne Murphy (who has the greatest Twitter handle) and the other conference speakers and sponsors. Following a few additional ales at The Cricketers, I decided to head home as I really didn’t want to miss the opening talk from Dave Trott in the morning. Fair play to Dom Hodgson though, he was heading out for more beers as I left, yet he still made it to the Dome early enough the next morning to (I believe) dress up and perform as a dancing Panda alongside his Penguin pal.

Dave Trott

Dave Trott – Predatory Thinking

At 9.30 on Friday morning I was undoubtedly the most excited SEO in the vicinity, having been eagerly anticipating Dave Trott’s talk on ‘Predatory Thinking’ since Kelvin told me he would be speaking. Dave Trott is an advertising legend, a fountain of marketing knowledge, experience and creativity, a man whose work I had once studied. He didn’t disappoint either – delivering his presentation in his native East Laandin accent; his tales and terminology (punters not customers) help you easily retain the underlying marketing principles. For example, he outlined the three primary components for delivering a message that gets noticed, using an evening conversation with ‘the wife’ as an example:

  • Impact – “Cath!”
  • Communication – “Make us a cup of tea.”
  • Persuasion – “You make us a cuppa tea and I’ll take the bins out.”

Dave was quick to highlight how marketers all too often focus on the Persuasion component without any consideration for Impact or Communication, so the message fails – only when all three components are delivered does an advertisement work. He mentioned the need to know two different languages in advertising, one for clients and another for punters. Speak to your clients in a way that leaves them thinking ‘this guy wrote the book’ on marketing, whilst delivering a message to your consumers that leaves them thinking ‘they’re actually talking to me’ – something I failed to deliver so seamlessly in my presentation through ‘Know your audience’ – speak their language.

Dave Trott was undoubtedly the highlight of a great day for me.

Unfortunately I missed a few other talks during the morning session, although I did make it in to see Rebecca Weeks deliver her debut presentation about Chasing the Algorithm and Will Critchlow (who was standing in for the artificially intelligent Tom Anthony) discuss APIs and the future of search.

Richard Baxter

Richard Baxter – SEO Gadget

 

The afternoon sessions flew by at lightning-fast speed, kick-started by Richard Baxter who (I’m pretty sure) was still writing his talk when I saw him rehearsing backstage. Richard took the stage and outlined How to be a Better SEO, offering some really great advice to any SEO worth their weight, including…

  1. Get out of your comfort zone on a regular basis
  2. Learn how to pitch yourself
  3. Learn something new once a week
  4. Understand the perception people have of you
  5. Realise when you’re distracted and learn to refocus yourself

I missed his slides, and some of the tips, as I was waiting in the wings to tag team him for the stage, but he left to a monster round of applause. No pressure then.

Brighton Dome
Re-filling after lunch

Why did I have to follow a seasoned pro like Richard!?  Thanks a bunch Kelvin.
I was blinded by the lights, nervous to be on stage at such a great venue as the Dome, where artists including Ziggy Stardust, Massive Attack, The Go Team and ABBA had previously performed.
I’d hinted to Pete Handley that my talk would involve music to some degree, but thankfully no Karaoke on this occasion, to the relief of the 1,500 strong audience.

I was here to preach the virtues of SEO Deliverance, or how to secure SEO changes at big brand websites, but to maintain a musical theme I had incorporated a selection of music artist names, song titles, lyrics and albums.  Hhhhhmmm, could that be enough ‘Impact’ that Dave Trott had referenced?  Maybe, although I forgot to ask the audience to raise a hand if ‘You’re not from Brighton’ (Fat Boy Slim) right from the start, oops.  So ‘What’s the story, morning glory?’ (Oasis), get the full low-down on delivering SEO change here.

Lynne Murphy was up next, but by the time I’d figured out how to escape from my lapel mike and navigated my way back to the auditorium through the backstage labyrinth, Lynne was half-way through her slides.  From what I could hear it was a real treat for the audience, a breath of fresh air for SEO’s who all too often get caught in the nitty-gritty technicalities of keyword research and digital marketing.  Hearing about the form and history of the English language was great – language is interesting, and Lynne an excellent speaker, fact.

James Little followed Lynne and swiftly enticed the crowd with an upper body striptease, risking life and limb as he revealed a Crystal Palace (arch rivals of Brighton & Hove FC) football strip beneath his smart white shirt.

Kelvin & Jason beat me! NOOOOO!

Beaten by the dynamic duo at SiteVisibility ;0)

Unfortunately the rest of my afternoon was somewhat fragmented, but I did manage to catch presentations from Aleyda Solis (Mobile SEO), Yousaf Sekander (Competitor analysis through Social Media), Jason Woodford (Changing the industry for the better) and Anna Lewis (Google analytics) – all of which were excellent in their own right. To save me from re-inventing the wheel, take a look at the Silicon Beach summary of the afternoon sessions.

So as the day turned into evening the BrightonSEO party kicked off in the Corn Exchange with a range of entertainment including table football (on which Michal and @ToastedTeacake remain unbeaten!), roller cycling, a velcro human-fly wall, a selection of old skool games consoles and of course a bar (essential). In addition, following the success of SEO Karaoke on the pier last time around, there was also a live band Karaoke stage ready for the budding singers and swingers, oh dear. Justin Taylor provides an excellent summary of the evening elements on the Graphitas blog.

Thanks once again to Kelvin for what was a fantastic day, marred only by my inability to beat Jason and Kelvin on the roller cycle sprint!
Posted on September 21, 2012 at 4:50 pm by Tony
In: Analytics, Natural search, SEM Strategies, SEO Strategy, Social

SEO Deliverance at BrightonSEO

SEO What?
Deliverance. 
Oh, right.
Thanks.

When Kelvin asked me to prepare a presentation outlining how to deliver SEO change at big brands, it got me thinking about the unique position SEO’s find themselves in at large corporations. SEO Deliverance is my take on how to successfully integrate and manage the SEO tasks and processes necessary to deliver change at big brand sites.

Brighton Dome
Re-filling after lunch

So here I was, on stage at the glorious Brighton Dome, preparing to preach the virtues of SEO Deliverance to a crowd of 1,500 digital marketers, so ‘Where do I begin?’ (Chemical Brothers)?

Music of course!

Brighton Dome is a superb music venue and, just like the music industry (outside of Mr Cowell’s vice-like grip), ‘search’ is a creative industry too. I therefore attempted to maintain a music theme throughout my presentation by including references to various artists, songs and albums.

To begin, I thought it best to define the meaning of deliverance (as detailed at www.thefreedictionary.com):

  1. a formal pronouncement or expression of opinion
    - In this instance, my opinion on how to successfully integrate and manage SEO tasks to deliver change at big brand sites.
  2. rescue from moral corruption or evil; salvation
    - Despite Google not being evil, all SEO’s seek salvation
  3. another word for delivery
    - The objective
  4. rescue from bondage or danger
    - Don’t be seen as ‘Just another brick in the wall’ (Pink Floyd) within the corporate structure.

Working in a big brand, SEO’s face an immediate danger of getting pigeon-holed in a single department (‘Dirty pigeons’ Blur – Parklife), bogged down with internal politics and endless sign-off processes, becoming invisible to the wider business. As SEO’s, you need to ‘Be free, to do what you want, any old time.’ (Soup Dragons).

@RichardBaxter mentioned during his talk just before me that you should “learn how to pitch yourself” & “understand the perception people have of you”. I wholly agree and recommend that you position yourself as a Guru within the business, the man to go to, the man that can, the ‘Man, I feel like a woman’ (Shania Twain). To do this you need to deliver something significant that gets you noticed within the business, a step change that makes a difference and earns you the respect required to secure project sign-off (and delivery) in the future.

To achieve delivery of SEO change at big brands, you need to know and action the ‘2Unlimited’ phases of Research, Development and Implementation:

 RESEARCH:

  • Know your market
  • Know your competition
  • Know your website
 DEVELOPMENT:

  • Know your objectives
  • Know your strategy
  • Know your limits
 IMPLEMENTATION:

  • Know your audience
  • Know your plan
  • Know your sh*t

 

Follow these phases and…
‘Know, know - know, know, know know - know know, know know - know know, there’s know limit’ to what you can deliver.  OK, I admit this was a stinker, but at least everyone knows it (excuse the pun).

The key take-outs from these nine ‘Knows’ are detailed below, and if you want to download the full presentation (minus a few images, sorry) it’s now available on SlideShare.

Know your market
  • Know your market through keyword research.
  • Assess the SERPs and the impact of universal search on the keywords you identify.
Know your competition
  • Audit the SERPs, who’s consistently competing for your target keywords in paid and organic channels?
  • Use Semetrical’s DeepCrawl tool to analyse your competitor’s website architecture.  Identify potential indexation, content and validation issues and areas of weaknesses – assess your opportunity.
Know your website
  • Review your site analytics and benchmark everything
  • Initiate an SEO audit
  • Run a comprehensive crawl of your website
  • Monitor site performance (FREE Robotto tool) and website changes through scheduled crawl analysis
  • Identify optimisation opportunities
Know your objectives
  • What is the website goal and what constitutes a conversion?
  • Have any specific objectives been set?
Know your strategy
  • What is the scope of the project?
  • What’s the capacity to deliver?
  • What are the costs?
  • Define the budget
  • Project the returns
Know your limits
  • Build and develop the central search team – you cannot do it all!
  • Identify departmental advocates, develop them as an extended search team
  • Maintain your team’s knowledge – Read, Debate and Innovate:
    Read up in the morning, discuss at lunch and innovate (test stuff) in the afternoon
Know your audience
  • Who are you pitching to?
  • What are their objectives?
  • Consider your delivery method:
    If it’s for the CFO – prepare the numbers.
    If it’s the Marketing Director – highlight brand potential and creative opportunities.
Know your plan
  • Start with a business case – it’s all about the ROI!
  • Communicate the strategic goal – avoid jargon
  • Reference your research (data) to quantify the opportunity
Know your sh*t
  • Be prepared when it comes to the pitch:
  • Know your research
  • Know your numbers
  • Know your plan and believe in it

In essence, SEO Deliverance at big brands is all about preparation, planning and perfecting the pitch to secure project sign-off. If you follow these simple process guidelines throughout the research, development and implementation stages, you will be able to prepare and deliver a solid business case and project plan that leads to repeated sign-off and delivery, quickly and efficiently.

Of course, sign-off is just one small part of the bigger SEO picture. You still need to manage and deliver the actual website change, monitor performance and quantify the returns delivered, but I’ll save that for another time.

Don’t forget to enter the Semetrical DeepCrawl Prize Draw which was announced at BrightonSEO.
Follow @Semetrical on Twitter and retweet the promotional tweet to enter before the end of September and you could win a DeepCrawl account of your very own!

Download the full SEO Deliverance presentation from @ToastedTeacake (minus a few images, sorry) at SlideShare.

Posted on September 19, 2012 at 12:34 pm by Tony
In: Natural search, SEM Strategies, SEO Strategy

Win a FREE DeepCrawl Account

That’s right!  We’re giving away a FREE DeepCrawl account to one of our Twitter followers as a way of saying ‘Thank you’.

To enter the prize draw and be in with a chance of winning a full (1 year) subscription, simply follow @Semetrical on Twitter and re-tweet the following promotional tweet before the end of September 2012…

Want to WIN a FREE DeepCrawl #SEO tool? Follow @Semetrical & RT to enter. Visit: http://ow.ly/dGlWS for full details.

One of our lucky Twitter followers will be selected at random from all valid entries received before the closing date of 23:59 September 30th, 2012.  The prize draw will take place on Monday 8th October 2012 and the winner announced via the Semetrical blog.

The winning follower will benefit from a fully functional DeepCrawl account loaded up with enough credits to crawl up to 1,000,000 webpage URLs.

The promotional DeepCrawl account will be valid for 1 year, expiring on October 8th 2013.
Additional tokens can be purchased to top up URL crawling capacity should it be exceeded within the first year.
Posted on September 14, 2012 at 2:30 pm by Tony
In: Uncategorized

DeepCrawl update – 1.5

We’re rolling out an update to DeepCrawl over the next week.

Here’s a summary of the changes which are mostly front-end interface improvements.

Monitoring Tab Redesign
The Website Monitoring tab which shows your scheduled crawls has been improved. It now displays mini-graphs for unique pages, duplicate pages, non-indexable pages and non-200 status codes to help give a better overview of any significant changes.

Share Reports
Every report now has a share URL near the downloads. This URL allows anyone to access the project without a login which is ideal for agencies to share with their clients.

Crawl Summary – Warnings/Notices/Improvements
This was a bit confusing, so we now show you a prioritised list of issues with the totals and also the significant changes between crawls.

New Report – Duplicate Content
In addition to Duplicate Pages, where the entire HTML is very similar, we have created an additional report under the Content tab which looks for high levels of duplication in the content excluding the HTML.
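To give a feel for the difference between the two reports, here is a rough sketch of comparing pages on their text alone. It is not how DeepCrawl measures duplication (that isn't described here); it simply strips the markup with Python's built-in HTML parser and scores the remaining text with `difflib`. The helper names are hypothetical, and a real check would also want to skip script and style blocks.

```python
from difflib import SequenceMatcher
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collect the text nodes of a page, ignoring tags and attributes."""

    def __init__(self) -> None:
        super().__init__()
        self.parts = []

    def handle_data(self, data: str) -> None:
        self.parts.append(data)


def visible_text(html: str) -> str:
    """Return the page text with markup removed and whitespace collapsed."""
    extractor = _TextExtractor()
    extractor.feed(html)
    extractor.close()
    return " ".join(" ".join(extractor.parts).split())


def content_similarity(html_a: str, html_b: str) -> float:
    """Similarity of two pages' text, from 0.0 (different) to 1.0 (identical)."""
    return SequenceMatcher(None, visible_text(html_a), visible_text(html_b)).ratio()
```

Two pages built from different templates but carrying the same copy score as identical here, even though their HTML is not very similar.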

Crawl Settings Stored
You can now see the specific crawl settings, such as the user agent, on the crawl overview screen. This wasn’t available before, so sometimes you’d forget when settings were changed. The full details can be viewed by expanding the link.

New Styling
We’ve updated the styling to be consistent with other Semetrical products. It shouldn’t affect anything but gives the package a slightly cleaner feel.

We’re planning the features to include in the next release so don’t hesitate to let us know which features you’d like the most.

Posted on August 31, 2012 at 7:00 am by chris
In: DeepCrawl

How to Fix the Broken Webmaster Tools Crawl Error Report

A few weeks ago the link to download the Crawl Errors data in Webmaster Tools broke.

Google’s taking a long time to fix it, but there is an easy way around it which we thought we would share. The download URL is malformed but easily corrected.

The URL looks something like this:

https://www.google.com/webmasters/tools/://:///webmasters/tools/crawl-errors-new-dl…

You just need to remove the duplicate paths to access the report. e.g.

https://www.google.com/webmasters/tools/crawl-errors-new-dl…
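If you would rather not edit the URL by hand each time, the same correction is a one-line string replacement. The snippet below is only a sketch: the function name is ours, and the sample URL is the truncated one shown above, left exactly as it appears.

```python
# The segment that is wrongly repeated in the broken download link.
DUPLICATED_SEGMENT = "://:///webmasters/tools/"


def fix_download_url(url: str) -> str:
    """Remove the first occurrence of the duplicated path from the URL."""
    return url.replace(DUPLICATED_SEGMENT, "", 1)


broken = (
    "https://www.google.com/webmasters/tools/"
    "://:///webmasters/tools/crawl-errors-new-dl…"
)
fixed = fix_download_url(broken)
```

A URL that is already correct does not contain the duplicated segment, so it passes through unchanged.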

Posted on August 30, 2012 at 8:17 am by chris
In: Analytics, Webmaster Tools

The end of Site Wide Linking?

Site wide links have long been utilised by website groups as a way of doing some quick and pain-free link building.

Many SEOs have already realised their value was questionable but it takes a long time for new ideas to filter into other departments.

However, we now have credible sources to show they provide very little value, if any.

Matt Cutts was interviewed recently by Danny Sullivan at SMX Advanced in Seattle and said…

We’ve done a good job of ignoring boilerplate, site wide links.

Bing also announced something similar in a recent blog post.

Site wide links often happen, and while they can be beneficial in terms of maybe driving direct traffic to your site, they are much less useful for organic ranking.  A site wide link is less of an endorsement than being mentioned in the body copy of the page.  

Whilst site wide links may provide some cross-brand exposure, don’t think they’re helping out with SEO performance.

Website groups are best off minimising large volumes of links to home pages and trying instead to generate as many relevant, varied and deep links as possible.

Posted on August 16, 2012 at 7:31 am by chris
In: Link building, SEO Strategy

Website Architecture Visualisations

Our site architecture tool, DeepCrawl, generates graphs visualising your site architecture by depth of links from the home page. We count the number of unique pages, duplicate pages, non-indexable URLs and non-200 URLs such as redirects and broken links.

Each graph is a different website we’ve analysed and you can immediately see which sites have what problems and how extensive they are.
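The idea of counting pages by link depth can be illustrated with a simple breadth-first search. This is a minimal sketch rather than DeepCrawl's crawler: the `links` mapping is a hypothetical stand-in for crawl data, and treating the home page as level 1 is our assumption.

```python
from collections import Counter, deque
from typing import Dict, List


def depth_histogram(links: Dict[str, List[str]], home: str) -> Dict[int, int]:
    """Count pages at each link depth, with the home page as level 1."""
    depth = {home: 1}
    queue = deque([home])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in depth:  # first discovery is the shortest path
                depth[target] = depth[page] + 1
                queue.append(target)
    return dict(Counter(depth.values()))
```

Plotting the resulting counts per level gives the basic shape of the graphs below; a real crawl would additionally split each level into unique, duplicate, non-indexable and non-200 URLs.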

Perfect Architecture

Almost entirely made up from indexable pages, peaking at level 4, fully crawled within 8 levels. This is the model for an efficient site architecture.

Pagination Long Tail

Fairly clean site with mostly unique pages but a problem with pagination architecture means new pages are still being discovered at level 35.

Internal Redirections and Long Tail

Some internal redirections are causing excess URLs and a poor linking architecture is resulting in a long tail of unique pages.

Heavy Canonicalisation

A huge number of additional URLs on level 4 which are all canonicalised to unique pages resulting in a very inefficient architecture to crawl.

Minor Canonicalisation

A fairly clean and well structured site architecture with all unique pages discoverable by level 7. Some additional canonicalised URLs are resulting in approximately 50% inefficiency.

Heavy Duplication

This site has a serious duplicate content URL issue which turns a few unique pages into a high volume of duplicate pages on unique URLs.

 

Posted on August 5, 2012 at 3:31 pm by chris
In: DeepCrawl