We're aware that some users are experiencing technical issues which the team are working to resolve. See the Community Noticeboard for more info. Thank you for your patience.
📨 Have you signed up to the Forum's new Email Digest yet? Get a selection of trending threads sent straight to your inbox daily, weekly or monthly!

extract a list of "deltas" from two lists of textstrings

Options
KingL
KingL Posts: 1,713 Forumite
extract a list of "deltas" from two lists of textstrings

Hi

I have two lists of textstrings, say

LIST 1
a
b
c
d
e
f
g
h
i
j
k
l
m

and

LIST 2
a
d
e
f
j
k
l


I want to extract a list of those items that occur in list 1, but not in list 2.
List 2 is always a subset of List 1, BUT THE CORRESPONDING LINES DON'T LINE UP WITH EACH OTHER. (e.g. textstring "m" might be in row 13 in List 1, but in row 4 in List 2)

How can I do this please? I guess in Excel, but I don't mind what software I use as long as it is available and freeware (Windows 10).

List 2 is always a subset of List 1. A textstring doesn't occur more than once in each list. The lists are already in ascending order.

If the rows lined up with each other, I could do the if(b=a,"match","mismatch") thing, but I don't know how to (automatically) recursively add empty cells into column b until a row of column b matches with the row of column a. I guess it will require Visual Basic, which I don't know.

I would prefer to use Libreoffice over MS Excel if possible. If Excel, would need to be Excel 2013.

Notes:
lines don't match up with each other
each entry doesn't occur more than once in each list
data is already in ascending order in each list
there are no entries in List 2 that aren't in List 1
textstrings contain alphanumerics and underscores only (no special characters other than _)
currently, both lists are in .txt files
if it is not possible to extract the list of deltas, then at least having the rows lined-up so that I can manually see the 'gaps' would be a help


tia
«1

Comments

  • that
    that Posts: 1,532 Forumite
    the short answer is open office calc. use the vlookup function to find if something is in column 1 and is or is not (your choice) in column 2

    long answer may follow soon
  • Cornucopia
    Cornucopia Posts: 16,470 Forumite
    Part of the Furniture 10,000 Posts Name Dropper Photogenic
    Yes, VLOOKUP.

    All of the Excel-lookalike spreadsheets should have this function including Libre Office, Google Sheets and of course Excel itself.

    In terms of syntax:-
    =IF(ISERROR(VLOOKUP(A1,$E$1:$E$50,1,false)),A1,"")
    

    With List 1 in Column A, the above formula all the way down Column B for each entry in Column A, and List 2 in Column E, this will give a Delta list in Column B. (Adjust $E$50 in the above to the actual length of List 2).

    There are various lookup functions in the main spreadsheets so if the above doesn't do exactly what you want, one of the others may do. Things like: MATCH and LOOKUP.
  • unforeseen
    unforeseen Posts: 7,381 Forumite
    Part of the Furniture 1,000 Posts Photogenic Name Dropper
    Or using Excel use conditional format to highlight the duplicates. The ones not highlighted are the ones not in list 2. To make life easier sort by colour as a second stage to bring them all together.
  • Tigsteroonie
    Tigsteroonie Posts: 24,954 Forumite
    Part of the Furniture 10,000 Posts Name Dropper Photogenic
    I'd be doing this with a COUNTIF statement - COUNTIF(range of list two,cell from list one). Result will be '0' if it's not in list two, or '1' if it does appear in list two. Then filter results to show only those with 0 i.e. the entries that only appear in list one.
    :heartpuls Mrs Marleyboy :heartpuls

    MSE: many of the benefits of a helpful family, without disadvantages like having to compete for the tv remote

    :) Proud Parents to an Aut-some son :)
  • unforeseen
    unforeseen Posts: 7,381 Forumite
    Part of the Furniture 1,000 Posts Photogenic Name Dropper
    edited 28 July 2018 at 9:36AM
    Which is easier? Putting that function in one cell, making sure you didn't make an error and it is giving the answer you want, and copying it down or highlight all the cells, click Conditional formatting, click duplicate values, click to accept default colours and all done. Why overcomplicated things by having to write functions for simple tasks where a solution is built in.

    Excel 2013 has the capability and no doubt Libre Office does as well


    K. I. S. S
  • Cornucopia
    Cornucopia Posts: 16,470 Forumite
    Part of the Furniture 10,000 Posts Name Dropper Photogenic
    unforeseen wrote: »
    Which is easier? Putting that function in one cell, making sure you didn't make an error and it is giving the answer you want, and copying it down or highlight all the cells, click Conditional formatting, click duplicate values, click to accept default colours and all done. Why overcomplicated things by having to write functions for simple tasks where a solution is built in.

    Excel 2013 has the capability and no doubt Libre Office does as well
    I think you'll find that the typical Spreadsheet user is much more familiar with Formulas than with the application of conditional formatting you have suggested. I'm sure they both work (though I'm not sure how well supported your method is amongst lookalikes - Google Sheets, for example can apply the formatting, though it requires an embedded formula, and it cannot then sort the list based on its formatting without an add-on).
    K. I. S. S
    Kindly Identifying Spreadsheet Solutions?
  • Andy_L
    Andy_L Posts: 13,017 Forumite
    Part of the Furniture 10,000 Posts Name Dropper
    unforeseen wrote: »
    Which is easier? Putting that function in one cell, making sure you didn't make an error and it is giving the answer you want, and copying it down or highlight all the cells, click Conditional formatting, click duplicate values, click to accept default colours and all done. Why overcomplicated things by having to write functions for simple tasks where a solution is built in.

    Excel 2013 has the capability and no doubt Libre Office does as well


    K. I. S. S

    Depends what the OP wants to do with the answer. If just being able to see the answer is enough then conditional formatting works, if they want to do some more analysis or produce a report then vlookup is more use
  • unforeseen
    unforeseen Posts: 7,381 Forumite
    Part of the Furniture 1,000 Posts Photogenic Name Dropper
    As the choice was between Libre Office and Excel then both programs implement the duplicate function.

    My point is why write formulas when the program you are using has a built in function to do that job? That is just making work for the sake of it

    I would not expect a spreadsheet such as Google Sheets to have that functionality - yet. It is still a maturing product compared to the mainstream products such as Excel & Libre/Open Office
    Kindly Identifying Spreadsheet Solutions?

    Close, but no banana
  • that
    that Posts: 1,532 Forumite
    you could even use word, takes slightly longer, but a lot longer with a bigger list, but no formulas.

    just import the full list, use search and replace to find the duplicate item and replace it with a blank
  • KingL
    KingL Posts: 1,713 Forumite
    Thanks for the suggestions, they are all a lot easier than what I was planning!

    Actually, the VLOOKUP code doesn't work as-is in Libreoffice (I assume that they have a different syntax for VLOOKUP), so the COUNTIF one has got me going more easily. It is easy enough to copy-paste the results as values and then sort on the 1s /0s column to get a contiguous list of the missing text strings (the zeros), which is what I want.

    I will dig around with getting to VLOOKUP to working in Libreoffice when I get more time. I have to do this kind of thing quite often so a range of options is very welcome. :)
This discussion has been closed.
Meet your Ambassadors

🚀 Getting Started

Hi new member!

Our Getting Started Guide will help you get the most out of the Forum

Categories

  • All Categories
  • 350.8K Banking & Borrowing
  • 253K Reduce Debt & Boost Income
  • 453.5K Spending & Discounts
  • 243.8K Work, Benefits & Business
  • 598.6K Mortgages, Homes & Bills
  • 176.8K Life & Family
  • 257.1K Travel & Transport
  • 1.5M Hobbies & Leisure
  • 16.1K Discuss & Feedback
  • 37.6K Read-Only Boards

Is this how you want to be seen?

We see you are using a default avatar. It takes only a few seconds to pick a picture.