There are almost certainly duplicate pages on your website, and indeed on nearly all websites. This is because a page can have many different URLs. How? There are many ways, but the main one is the ‘query string’, the stuff you’ll see added to the end of a standard page address after a ‘?’. For example, we may set up external links to a page with tracking parameters (e.g. www.bmon.co.uk/?utm_source=daily_email takes visitors to www.bmon.co.uk while telling us they came from an email). Or the search feature on our site may add query string parameters, as might printable versions.
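To see why this counts as duplication, here is a minimal sketch in Python using the standard library’s urllib.parse. It uses the article’s example addresses (with an assumed https:// scheme added so they parse): the two strings are different URLs, even though they point at the same underlying page.

```python
# Illustrative sketch: the same page reached via different query strings
# is, to a crawler, two distinct URLs. Addresses are the article's example,
# with an assumed https:// scheme prepended.
from urllib.parse import urlsplit

a = "https://www.bmon.co.uk/?utm_source=daily_email"
b = "https://www.bmon.co.uk/"

print(a == b)                                # False: two distinct URLs
print(urlsplit(a).path == urlsplit(b).path)  # True: same underlying page path
print(urlsplit(a).query)                     # the tracking parameter alone
```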
Now, search engines deal in URLs, not pages, so they see all these as separate items. They’re intelligent enough to sort things out eventually, so don’t worry about being ‘penalised’ or anything like that. However, a site only gets a certain ‘crawl budget’ (resources that the search engine will allocate to inspecting the site), so we don’t want that wasted analysing the same page over and over again just because it has different URLs.
We avoid this waste by including a tag on every page which tells the search engine the actual URL we want indexed. The ‘canonical’ tag is supported by all the major crawlers, and should be generated automatically by your content management system. But do make sure it is. This useful tool will allow you to enter a sample page from your site and see if a canonical tag is present. If not, it’s easy to fix.
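For the curious, the canonical tag takes the standard form `<link rel="canonical" href="...">` in a page’s head. The sketch below, in Python using only the standard library, does the same sort of check as such a tool: it parses some sample markup and reports the canonical URL if one is declared. The sample page and the find_canonical() helper are illustrative, not from any particular product.

```python
# Minimal sketch: check a page's HTML for a canonical tag, assuming the
# standard <link rel="canonical" href="..."> form. Sample markup and the
# find_canonical() helper name are illustrative assumptions.
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Records the href of the first <link rel="canonical"> tag seen."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "link" and attrs.get("rel") == "canonical" and self.canonical is None:
            self.canonical = attrs.get("href")


def find_canonical(html: str):
    """Return the canonical URL declared in the markup, or None."""
    parser = CanonicalFinder()
    parser.feed(html)
    return parser.canonical


# However a visitor arrived (tracking links, search results, print views),
# the tag tells crawlers this one URL is the copy to index.
sample = """
<html><head>
  <link rel="canonical" href="https://www.bmon.co.uk/" />
</head><body>Welcome</body></html>
"""
print(find_canonical(sample))  # https://www.bmon.co.uk/
```

If the function returns None for your own pages, that is the cue to check your content management system’s settings.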