Finding duplicate words/phrases in Word 2016?

Laz123
Posts: 1,742 Forumite


in Techie Stuff
I know about the Find/Replace option, which is good if you know the word you're looking for. But is there a way to automatically locate duplicated words in a 139,000-word document when you don't know which ones have been repeated?
TIA
Comments
-
File > Options > Proofing > Flag repeated words.
But it won't pick up phrases, as referred to in the title.
-
WaywardDriver wrote: » File > Options > Proofing > Flag repeated words.
But it won't pick up phrases, as referred to in the title.
That option is already ticked, but it doesn't seem to do anything when I run a spell-check.
-
The following method goes through the entire document and finds words that are repeated back to back (e.g. 'the the'). Hitting the Replace All button removes the second copy of each, if that is what you want.
1. In Word, press Ctrl+H to open the Find and Replace dialog box.
2. Click More, then select the Use wildcards option.
3. In the Find field, type: (<[A-Za-z]@)[ ,.;:]@\1>
(As the pattern contains a space, I suggest you copy the string above and paste it into the Find field.)
4. In the Replace field, type: \1
5. Click the Find Next button to check the first match.
6. Click the Replace All button.
You're done, and Word tells you how many repeated words it found and replaced.
How this works:
• Find: Look for the start of a word (<) made up of one or more (@) letters ([A-Za-z]), followed by one or more (@) spaces or punctuation marks ([ ,.;:]), followed by the same word again (\1) running up to the end of a word (>). The round brackets are what let \1 refer back to the first word.
• Replace: Replace the whole match (both copies of the word) with just the first word (that’s the \1 bit), which effectively deletes the repeated word.
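For anyone who would rather test the idea outside Word, here is a rough Python sketch of the same doubled-word search. It is not part of the Word method above: it assumes the text has already been saved as plain text, and the function names `find_doubled_words` and `remove_doubled_words` are made up for illustration. Like the wildcard pattern, it is case-sensitive, so 'The the' would not be caught.

```python
import re

# Rough Python counterpart of the Word wildcard pattern (<[A-Za-z]@)[ ,.;:]@\1>
# \b is a word boundary (Word's < and >), + means "one or more" (Word's @),
# and \1 means "the same text that group 1 matched".
DOUBLED = re.compile(r"\b([A-Za-z]+)[ ,.;:]+\1\b")


def find_doubled_words(text):
    """Return each word that is immediately repeated, in order of appearance."""
    return [match.group(1) for match in DOUBLED.finditer(text)]


def remove_doubled_words(text):
    """Keep only the first word of each doubled pair (like replacing with \\1)."""
    return DOUBLED.sub(r"\1", text)


sample = "It was the the best of times, it was was the worst."
print(find_doubled_words(sample))
print(remove_doubled_words(sample))
```

Listing the matches first, before removing anything, is the safer order on a long manuscript, since some doubled words ('had had', 'that that') can be intentional.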
Please let us know how you get on with it.
Hope it helps.
-
That is good, but I want to pick the replacement words from a thesaurus, i.e. if 'huge' is repeated multiple times I'd like to replace it with 'vast', 'enormous', etc.
-
So, if I understand, you're looking for instances of any word. How do you decide which word you're looking for?
-
I don't know beforehand which words have been repeated many times in a book of 139,000 words. That's why I'd like to find which ones have been duplicated (some maybe many times) and be able to change them to alternatives so the text isn't too repetitive. If that makes sense.
-
Sounds like you need a proper proof-reader for that. (Although I could be wrong, I often am.)
-
Yes, it makes sense. I'm just racking my brains about how to go about it. It's a very difficult thing to do, so I need some time to think about it.
-
A general method (not specific steps at present...)
Save a copy of the document.
In the copy, do a whole-document Find and Replace, replacing each space with a paragraph mark (^p) so every word sits on its own line.
Save the doc as a plain text file.
Import it into Excel, giving one word per row.
Then insert a row at the top and put a title for the words column in it, e.g. 'Word'.
Select all cells in that column, from the title to the last word of the doc, and make a pivot table from it using Count, sorted by count from the top down (this gives you the most frequent words).
Probably filter out the common ones such as a, the, but etc.
Use that as a basis to make your judgement..?
This will give you a count for each unique word, but you would still need to go back into Word and find the ones you wished to change.
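The same counting idea can be sketched in a few lines of Python instead of Excel. This is only an illustration under some assumptions: the book has been saved as plain text, the `COMMON_WORDS` list is a made-up starting point that you would extend yourself, and the name `word_frequencies` is hypothetical.

```python
import re
from collections import Counter

# Hypothetical list of words to ignore; extend it to suit the manuscript.
COMMON_WORDS = {"a", "an", "and", "but", "in", "of", "the", "to"}


def word_frequencies(text, ignore=COMMON_WORDS):
    """Count each word case-insensitively, skipping the common ones."""
    # Letters only, with an optional straight apostrophe part (e.g. "don't").
    words = re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())
    return Counter(word for word in words if word not in ignore)


sample = "The huge dog chased a huge cat across the huge garden."
# Print the most frequent words first, like the sorted pivot table.
for word, count in word_frequencies(sample).most_common(3):
    print(word, count)
```

For a real manuscript you would read the saved text file into `text` (for example with `pathlib.Path("book.txt").read_text(encoding="utf-8")`, where `book.txt` is a placeholder name) and look further down the list than the top three. As with the pivot table, this only tells you which words are overused; choosing the alternatives is still a manual job in Word.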
You could do something similar in Access.....
Really wonder if it would be worth it rather than finding a proof reader?
Not a great answer but maybe food for thought?
I wonder how those algorithms work that size words according to their frequency.......?
This discussion has been closed.