Your browser isn't supported
It looks like you're using an old web browser. To get the most out of the site and to ensure guides display correctly, we suggest upgrading your browser now. Download the latest:

Welcome to the MSE Forums

We're home to a fantastic community of MoneySavers but anyone can post. Please exercise caution & report spam, illegal, offensive or libellous posts/messages: click "report" or email forumteam@. Skimlinks & other affiliated links are turned on

Search
  • FIRST POST
    • Robm1955
    • By Robm1955 17th May 19, 9:11 AM
    • 384Posts
    • 108Thanks
    Robm1955
    PDF to Excel
    • #1
    • 17th May 19, 9:11 AM
    PDF to Excel 17th May 19 at 9:11 AM
    I know I have asked about Transpose before, but I'm stuggling with this one. I want to turn a pdf into an Excel spreadsheet, with the coloured text below, as 5 colums, but can't find the quickest way to do it. They seem to have changed the layout of the table/pdf this time.


    FRANKENSTEINS PORTER 5.4 BLACK DB
    Full favoured porter. Strawberries, liquorice, citrus and cofee
    all present. Smooth to the touch with a strong mouth feel.
    ABSTRACT JUNGLE (Langley Mill)
    BLUE BOAR 4.5 RUBY MB1
    Dark with a hoppy aftertaste.
    STURDY IPA 5.6 PALE MB1
    A classic style IPA using pale malt and lots of big hitting
    American hops. Resinous pine and citrusfruits.

    HARE OF THE HILL on hand pump 4.6 GOLDEN DB
    Strawberries and cream. Hazy sunshine with a whisper of
    raspberry. A hoppy golden ale.
    There are 14 pages, all set out like this. There is a heading for the brewery as well, but that doesn't appear on every line, so I can add that later. TIA.
    Last edited by Robm1955; 17-05-2019 at 9:13 AM.
Page 1
    • John Gray
    • By John Gray 17th May 19, 12:33 PM
    • 5,431 Posts
    • 3,223 Thanks
    John Gray
    • #2
    • 17th May 19, 12:33 PM
    • #2
    • 17th May 19, 12:33 PM
    You might make it a bit more intelligible if you surrounded the required text with the tags [ code] and [ /code] (without the blank space after each left-square-bracket).

    As it stands I have no idea where the five columns should be.
    • Cornucopia
    • By Cornucopia 17th May 19, 12:38 PM
    • 11,906 Posts
    • 13,344 Thanks
    Cornucopia
    • #3
    • 17th May 19, 12:38 PM
    • #3
    • 17th May 19, 12:38 PM
    I imagine the OP wants these 5 data items isolated into columns:-

    1/ FRANKENSTEINS PORTER
    2/ 5.4
    3/ BLACK
    4/ DB
    5/ Full favoured porter. Strawberries, liquorice, citrus and coffee all present. Smooth to the touch with a strong mouth feel.

    Unless there are usable separators between the column items that originate from the PDF, then I think it will be tricky to do it fully automatically.

    It might be possible to key onto the number (assuming it is consistent) to split columns 1, 2 & 3. Then maybe a list of the possible codes for item 4?

    edit: Using Regular Expression matching in Google Sheets is probably the easiest way.

    This formula matches everything up to a number that looks like X.X...
    Code:
    =REGEXEXTRACT(B1,"(^.*)[0-9][.][0-9]")
    Last edited by Cornucopia; 17-05-2019 at 3:15 PM.
    I'm a Board Guide on the Phones & TV, Techie Stuff, In My Home,
    The Money Savers Arms and Food Shopping boards. I'm a volunteer to help the boards run smoothly, and I can move and merge threads there. Any views (especially those on the UK TV Licence) are mine and not the official line of moneysavingexpert.com.

    Board guides are not moderators. If you spot an inappropriate or illegal post then please report it to forumteam@moneysavingexpert.com
    • Robm1955
    • By Robm1955 19th May 19, 10:03 AM
    • 384 Posts
    • 108 Thanks
    Robm1955
    • #4
    • 19th May 19, 10:03 AM
    • #4
    • 19th May 19, 10:03 AM
    You might make it a bit more intelligible if you surrounded the required text with the tags [ code] and [ /code] (without the blank space after each left-square-bracket).

    As it stands I have no idea where the five columns should be.
    Originally posted by John Gray
    Sorry if I wasn't clear. Thanks Cornucopia, that's what I want.
    • Cornucopia
    • By Cornucopia 19th May 19, 1:54 PM
    • 11,906 Posts
    • 13,344 Thanks
    Cornucopia
    • #5
    • 19th May 19, 1:54 PM
    • #5
    • 19th May 19, 1:54 PM
    If your source data comes in continuous lines then I think the idea above may be your best chance of splitting it without too much effort.

    As above:

    Code:
    =REGEXEXTRACT(B1,"(^.*)[0-9][.][0-9]")
    will isolate the text in front of any number that is formatted as X.Y.

    If that text is placed in C1, then the formula below will isolate the number itself:
    Code:
    =REGEXEXTRACT(B1,"^" & C1 & "([0-9][.][0-9])")
    This formula isolates everything after the number:-

    Code:
    =REGEXEXTRACT(B1,"^" & C1 & "[0-9][.][0-9](.*$)")
    The next step depends on how many options there are for the code that is "DB" in the first item.
    I'm a Board Guide on the Phones & TV, Techie Stuff, In My Home,
    The Money Savers Arms and Food Shopping boards. I'm a volunteer to help the boards run smoothly, and I can move and merge threads there. Any views (especially those on the UK TV Licence) are mine and not the official line of moneysavingexpert.com.

    Board guides are not moderators. If you spot an inappropriate or illegal post then please report it to forumteam@moneysavingexpert.com
Welcome to our new Forum!

Our aim is to save you money quickly and easily. We hope you like it!

Forum Team Contact us

Live Stats

2,526Posts Today

5,726Users online

Martin's Twitter
  • Mini MSE is on half term next week, so I'm excited to be taking the week off to be daddy. As normal I'm signing of? https://t.co/G3366shWh1

  • I once blurted out on @gmb "Theresa May hasn't been given a poisoned chalice - she's been given a poisoned chalice? https://t.co/onfRbY3XVg

  • It'd be fascinating to know how history will judge Theresa May's premiership. Currently, it is hard to see it as a? https://t.co/eH77G0O9LA

  • Follow Martin