Trying to understand what's going on here?

rowan222 · 25 January at 5:48PM

This concerns a private members society here in the UK. To avoid any copyright issues or breaking any forum rules about posting content, I'm anonymising some of the details. I'm just trying to understand why/how this is happening on a technical level.

This refers to a private members society concerned with a niche area of history. An annual membership is charged and a quarterly journal is published in hard copy and posted to members. A digital archive of all past journals up to the present is available to purchase by members only. No copies of the journal are available on their website. I am not a member, although I'm interested in the subject area and have considered joining in the past.

I'll refer to the society as "Society" and the journal name as "Journal"

So I was googling around to see if there were any "Journal" copies available online to have a browse. I found none. Continuing with more advanced search techniques I then found a couple of old "Journal" copies as a PDF that I could download.

The web address seemed unusual and was in the format of:

https://xxxx-yyyyyyy.ddns.net:2443/journals/journal150.pdf

(again anonymised and journal(s) is the exact name of said publication)

I started playing around with the web address to see if there were more. Just entering:

https://xxxx-yyyyyyy.ddns.net:2443/journals/

resulted in a "403 Forbidden"

However I quickly found that simply changing the last part of the web address to another number e.g.: /journal165.pdf

resulted in another downloadable PDF. In fact, I can enter any journal number and get a PDF that can be downloaded: every journal published in their 50-year history is available this way.

I'm familiar with a couple of online archive sites that host a lot of articles, some of which I'm sure are unauthorised, but this seems very different. It almost seems like someone's made a big mistake here. I'm positive "The Society" has not authorised this.

I'd just like to understand what exactly is going on. Any thoughts?

flaneurs_lobster · 25 January at 6:18PM

Someone has tried to set up Dynamic DNS for a site and got it wrong. It's likely that the archive is stored on a local server rather than on an ISP's site.

Instead of stealing their stuff why not just contact The Society and let them know?

This refers to a private members society concerned with a niche area of history.

Is it the Nazis?

rowan222 · 25 January at 6:27PM

flaneurs_lobster said:

Instead of stealing their stuff why not just contact The Society and let them know?

This refers to a private members society concerned with a niche area of history.
Is it the Nazis?

I'm not "stealing their stuff"! I just wanted to find a couple of sample copies online and assess if it was something worth subscribing to. I was not looking for or expecting to find the complete archive. That was a bit of a shock. I'll be informing them as it's obviously a security breach.

No, nothing to do with the Nazis, much earlier than that!

Neil_Jones · 25 January at 7:51PM

I was not looking for or expecting to find the complete archive. That was a bit of a shock. I'll be informing them as it's obviously a security breach.

It's not necessarily a "security breach" if you found it in a Google search.

I've found all kinds of old school or society newsletters (typically where Fred Bloggs has been mentioned because he was student of the year or scored the most goals in a football match or whatever) and something in those newsletters happened to contain whatever I'd been searching for. When you go back to the home page none of those are linked, because they were all published like 5/10 years ago. What it means is that it was an active link at some point and Google found it, but the institute in question never removed the files.

If these journals have been deliberately arranged in a way that is accessible on the website (not necessarily linked from anywhere) then that doesn't make it a security breach either.

Vitor · 25 January at 8:34PM

The likely chain is simple. Someone exposed the PDFs on a publicly reachable HTTPS service on port 2443. At least one direct link leaked somewhere, possibly years ago. Googlebot fetched it, saw a valid PDF, and indexed it. The lack of access control allowed further files to be fetched once their names were known/guessed.

How to prevent this depends on the web-server hosting the, ahem, 'journals'.
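To make the "lack of access control" point concrete, here is a minimal sketch in Python of the kind of check that appears to be missing. It is not the real server's configuration, which nobody in this thread has seen. The names `PROTECTED_PREFIX`, `VALID_TOKENS` and `decide_response` are hypothetical, and the bearer-token scheme is only an assumption standing in for whatever member login the Society would actually use. The idea is that the server authenticates the request before it looks at the file name, so a guessed name gets nowhere, and it also sends an `X-Robots-Tag` header asking search engines not to index the files.

```python
from http import HTTPStatus

# Hypothetical values for illustration only; nothing here comes from the real server.
PROTECTED_PREFIX = "/journals/"
VALID_TOKENS = {"example-member-token"}


def decide_response(path, auth_header=None):
    """Return (status, extra_headers) for a GET request to `path`."""
    if not path.startswith(PROTECTED_PREFIX):
        return HTTPStatus.NOT_FOUND, {}

    # Ask well-behaved crawlers not to index anything under the prefix.
    headers = {"X-Robots-Tag": "noindex, nofollow"}

    # Check credentials before considering which file was requested,
    # so that guessing journal151.pdf, journal152.pdf, ... achieves nothing.
    token = None
    if auth_header and auth_header.startswith("Bearer "):
        token = auth_header[len("Bearer "):]
    if token not in VALID_TOKENS:
        headers["WWW-Authenticate"] = "Bearer"
        return HTTPStatus.UNAUTHORIZED, headers

    # Even for members, refuse to list the directory itself.
    if path == PROTECTED_PREFIX:
        return HTTPStatus.FORBIDDEN, headers

    return HTTPStatus.OK, headers
```

With this in place, an anonymous request for `/journals/journal165.pdf` would get a 401 rather than the PDF. The server described in the first post seems to do only the directory-listing part (the 403 on `/journals/`) and skips the credential check on individual files.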

rowan222 · 25 January at 11:23PM

Vitor said:

The likely chain is simple. Someone exposed the PDFs on a publicly reachable HTTPS service on port 2443. At least one direct link leaked somewhere, possibly years ago. Googlebot fetched it, saw a valid PDF, and indexed it. The lack of access control allowed further files to be fetched once their names were known/guessed.

How to prevent this depends on the web-server hosting the, ahem, 'journals'.

That sort of makes sense now. It would seem that whoever runs the server hosting these did not intend them to be public, for two reasons:

1. There's no direct link to find them, i.e. it's not easy unless you get curious like I did.

2. The server, at an address in the format https://xxxx-yyyyyyy.ddns.net:2443, is not the Society's website, or even remotely like it.

When I did the Google search for "The Society" and "Journal" I got hits in the form of:

"No IP" followed by references to a particular journal number. When I clicked on that it loaded that journal as a PDF that could be saved. But only that particular copy.
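For anyone wondering what the address itself gives away, a short Python sketch that splits the anonymised URL from the first post into its parts. The helper name `split_url` is made up for illustration. As far as I know, ddns.net is one of the domains offered by the No-IP dynamic DNS service, which would fit the "No IP" text in the search results, but treat that as an assumption rather than something confirmed in this thread.

```python
from urllib.parse import urlsplit

# The anonymised address from the first post, not a real host.
URL = "https://xxxx-yyyyyyy.ddns.net:2443/journals/journal150.pdf"


def split_url(url):
    """Break a URL into the parts that hint at how the server is hosted."""
    parts = urlsplit(url)
    return {
        "scheme": parts.scheme,  # https: the traffic is encrypted
        "host": parts.hostname,  # a dynamic DNS name, not the Society's own domain
        "port": parts.port,      # 2443 rather than the usual 443 for HTTPS
        "path": parts.path,      # predictable, numbered file names
    }


info = split_url(URL)
print(info)
```

A dynamic DNS hostname combined with a non-standard port is typical of a machine on a home or office connection with a port forwarded through the router, which supports the "local server" theory earlier in the thread.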

GDB2222 · 26 January at 4:46PM

Deep linking into a website can be a crime under the Computer Misuse Act, especially when it's done the way you have done it, by guessing links that are not published by the website owners.

Unless you are 100% sure the owners of the website will welcome your telling them there's a security flaw, rather than reporting you to the police, it might be prudent simply not to tell them.

Vectis · 28 January at 7:12PM

https://forums.moneysavingexpert.com/discussion/comment/81854914#Comment_81854914

Just a quick google:

'If a website uses predictable, human-readable URL structures (e.g., example.com/report-Q1, example.com/report-Q2), guessing the next in the sequence is often considered reasonable navigation'

Grumpy_chap · 28 January at 8:27PM

https://forums.moneysavingexpert.com/discussion/comment/81858811#Comment_81858811

Isn't that how the Budget got leaked early this year?

GDB2222 · 29 January at 9:32AM

https://forums.moneysavingexpert.com/discussion/comment/81858811#Comment_81858811

Indeed. I’m not saying that the law isn’t daft. :)
