We'd like to remind Forumites to please avoid political debate on the Forum. This is to keep it a safe and useful space for MoneySaving discussions. Threads that are - or become - political in nature may be removed in line with the Forum’s rules. Thank you for your understanding.

Excel formula help

Hi,

I am after some excel help.

I have a large amount of data in a file across three worksheets. I want to anonymise the worksheets by removing an email address, but I want to replace the email address with a code so that it is always replaced with the same code.

For example, I would replace fredbloggs@hotmail.com with 12345 across every worksheet (the email address may appear more than once in each spreadsheet).

I can create a new worksheet with all the email addresses and allocate them codes. Can I then use a formula to replace all the email addresses with the allocated code?


Comments

  • You need a use a hash function. There is not one built into Excel but a Google search will show you how. However, this alone is unlikely to meet the legal definition of "anonymous" but this may not be necessary depending on your needs.
  • lammy82
    lammy82 Posts: 594 Forumite
    Part of the Furniture 500 Posts Name Dropper
    edited 20 August 2020 at 7:11AM
    Simple way, copy and paste all the email addresses into a separate worksheet in a single column. Then remove all the duplicates so you have a list of unique addresses. In the second column put an incrementing lookup value (e.g. 1000 in the first row, 1001 on the second, then select and drag down so it fills all the way down to the bottom of the list).

    Then, in your main worksheets, in columns where you have these email addresses, replace them with a formula using VLOOKUP to pull back the code for that email address. Hopefully these email addresses appear in specific columns in your worksheets so you can just apply the formula to a whole column at a time instead of having to pick through row by row.

    Finally if you want to distribute the data to others without revealing the addresses, you need to do a copy + paste values into a new workbook. The email addresses are not shown on the worksheets but are present in the VLOOKUP formula so make sure that formula doesn't come across to the new workbook that you share with others.
  • DoaM
    DoaM Posts: 11,863 Forumite
    10,000 Posts Fifth Anniversary Name Dropper Photogenic
    Are you using the spreadsheet as a spreadsheet, or more like a flat file database? If the latter then it sounds like an Access database would be better for the task.
  • keithdc
    keithdc Posts: 459 Forumite
    Part of the Furniture 100 Posts Name Dropper Combo Breaker
    Thank you- very helpful. 
    Could certainly import data to access- it will just require some additional faff as the email address removal is really the only processing I need to do before importing in to analysis software. 
  • keithdc
    keithdc Posts: 459 Forumite
    Part of the Furniture 100 Posts Name Dropper Combo Breaker
    You need a use a hash function. There is not one built into Excel but a Google search will show you how. However, this alone is unlikely to meet the legal definition of "anonymous" but this may not be necessary depending on your needs.
    Agree- was using the "anonymous" in the everyday rather than legal context. 
  • WaywardDriver
    WaywardDriver Posts: 546 Forumite
    Seventh Anniversary 500 Posts
    edited 20 August 2020 at 2:57PM
    ?? Use conditional formatting to set text colour of cells with email addresses to background colour i.e. white. Then protect sheet and cells to prevent contents showing in formula bar as per https://www.techonthenet.com/excel/cells/hide_formulas2016.php
    However may not be what you want as cells would appear blank without the anonymised code.
  • rmg1
    rmg1 Posts: 3,148 Forumite
    Part of the Furniture 1,000 Posts Name Dropper
    edited 21 August 2020 at 8:02AM
    lammy82 said:
    Simple way, copy and paste all the email addresses into a separate worksheet in a single column. Then remove all the duplicates so you have a list of unique addresses. In the second column put an incrementing lookup value (e.g. 1000 in the first row, 1001 on the second, then select and drag down so it fills all the way down to the bottom of the list).

    Then, in your main worksheets, in columns where you have these email addresses, replace them with a formula using VLOOKUP to pull back the code for that email address. Hopefully these email addresses appear in specific columns in your worksheets so you can just apply the formula to a whole column at a time instead of having to pick through row by row.

    Finally if you want to distribute the data to others without revealing the addresses, you need to do a copy + paste values into a new workbook. The email addresses are not shown on the worksheets but are present in the VLOOKUP formula so make sure that formula doesn't come across to the new workbook that you share with others.
    Instead of doing the numbers against each email address on the spare worksheet and using VLOOKUP, use the MATCH function (it's built in to Excel so no extra work needed).
    Again, copy the email addresses into a spare sheet and then remove duplicates (removing duplicates may not be necessary, I can't remember.  Might be useful for tidiness.).
    The use "=MATCH(cell_to_check_for,range_to_check,0)"
    This works across worksheets (but not workbooks unless they are open) and would return the row the email address is on in the spare worksheet.
    ::edit::
    Extra info
    The bonus of using the match function is, if it can't find the value you're looking for, you get "#N/A" so you'll know instantly if there are any email address you need to add to your look-up list.
    :wall: Flagellation, necrophilia and bestiality - Am I flogging a dead horse? :wall:

    Any posts are my opinion and only that. Please read at your own risk.
Meet your Ambassadors

🚀 Getting Started

Hi new member!

Our Getting Started Guide will help you get the most out of the Forum

Categories

  • All Categories
  • 348.9K Banking & Borrowing
  • 252.4K Reduce Debt & Boost Income
  • 452.7K Spending & Discounts
  • 241.8K Work, Benefits & Business
  • 618.3K Mortgages, Homes & Bills
  • 176K Life & Family
  • 254.8K Travel & Transport
  • 1.5M Hobbies & Leisure
  • 16.1K Discuss & Feedback
  • 15.1K Coronavirus Support Boards

Is this how you want to be seen?

We see you are using a default avatar. It takes only a few seconds to pick a picture.