Duplicate content is a misunderstood term in search engine optimisation. Some people believe that if they have the same page available under different URLs, they'll incur some sort of ‘penalty’ in the search engine rankings. It's true that this kind of duplication can leave a site's rankings worse than they might otherwise be, but that's not an imposed penalty. What happens is that if we have the same content on page A and page B, the search engine has to decide which one to show and which to ignore. That's fine in itself, but remember that external links are the ‘currency’ of the search engines: if a page exists under different URLs, the external links pointing at it may be split across the two, and we can't guarantee the search engines will consolidate them. We get around this using the ‘rel=canonical’ tag, which I've described before. This indicates to the search engines the single preferred URL for the page, and is usually respected.
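As a minimal sketch of how the tag is used (the domain and path here are placeholders, not a real site): if the same article is reachable at both a clean URL and, say, a tracking-parameter variant, every duplicate version can carry the same canonical link in its head section:

```html
<!-- Placed in the <head> of each duplicate version of the page. -->
<!-- www.example.com and the path are illustrative placeholders. -->
<link rel="canonical" href="https://www.example.com/blog/duplicate-content" />
```

Search engines treat this as a strong hint rather than a binding directive, which is why it is described above as ‘usually respected’.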
Manual penalties for duplicate content come about when a site is based purely on content taken from elsewhere on the web. There are estimated to be millions of such sites, using machines to throw together content stolen from elsewhere in the hope of attracting traffic that might click on the accompanying advertising. The search engines have to identify these sites (which they do, incredibly well), and that's where the penalty comes in: such sites are simply thrown out of the results. But the days when legitimate sites got poor search engine rankings because they repeated their own content are long gone.