Hot laptop slowing Excel?
Comments
Ok, for anyone who is interested, I seem to have fixed the problem (again!)
I think it was the clipboard getting too big due to some copying and pasting. I've added a line to clear the clipboard on every iteration and it's back to under 1s.
Fingers crossed it stays that way this time.
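In case anyone wants to do the same, it's something along these lines - just a minimal sketch, with the actual copy/paste work omitted (the user32 declarations are the standard ones for 64-bit Office):

    Option Explicit

    ' Standard Windows API declarations for emptying the system clipboard.
    Private Declare PtrSafe Function OpenClipboard Lib "user32" (ByVal hwnd As LongPtr) As Long
    Private Declare PtrSafe Function EmptyClipboard Lib "user32" () As Long
    Private Declare PtrSafe Function CloseClipboard Lib "user32" () As Long

    Sub ProcessFiles()
        Dim i As Long
        For i = 1 To 1000
            ' ... the copy/paste work for one file goes here ...

            Application.CutCopyMode = False   ' cancel Excel's pending copy
            If OpenClipboard(0) <> 0 Then     ' then empty the Windows clipboard itself
                EmptyClipboard
                CloseClipboard
            End If
        Next i
    End Sub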
1) Why create a connection and destroy it again over and over?
2) Looping? Procedural code is the worst way of dealing with datasets. If the data is in a local SQL Server db, why not use its incredibly powerful setwise data handling?
3) Producing files? Have a look at whatever they call DTS these days (SSIS) - you may well be able to do the whole thing in a few heartbeats, without Excel, VBA or looping.
Just some thoughts.
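To make 2) concrete, the sort of thing I mean - purely illustrative, with a made-up table and column names - is letting the server do in one statement what a VBA loop does row by row:

    ' Illustrative sketch only - "Orders" and its columns are invented names.
    Sub SetwiseUpdate()
        Dim conn As Object
        Set conn = CreateObject("ADODB.Connection")
        conn.Open "Provider=SQLOLEDB;Data Source=(local);" & _
                  "Initial Catalog=MyDb;Integrated Security=SSPI;"

        ' One set-based statement instead of looping through rows in VBA:
        conn.Execute "UPDATE Orders SET Total = Qty * UnitPrice WHERE Total IS NULL"

        conn.Close
        Set conn = Nothing
    End Sub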
1) Because the connection only takes 0.1s to establish, grab the data, and put it into the sheet. It doesn't make any meaningful difference to the run time.
2) SQL is simply incapable of arranging the data how I need it, as it's a database program, not an analysis package.
3) As above, SQL just isn't designed for doing what I'm doing. Excel does it perfectly and only takes a few hours.
OK, it's just that in a corporate environment this is often a sign of a homebrew system that could be made rugged and faster with a bit of a redesign. No idea about your circumstances, but oftentimes there are really good reasons why a review would be a good idea - for instance, the overhead of all those connection creations and destructions would have an impact on the database at the receiving end too, and that can be unpopular over a corporate LAN. It's also a warning flag for a system which becomes part of the business' core processes but which is being farmed off to a random laptop that can fail/get infected/etc. It's also a flag for the system being shared with other users, creating a shadow IT support requirement (originator leaves the company, the IT dept starts getting calls about some spreadsheet thing and finds an undocumented, sprawling mess of dependencies). This is the reason IE6 is so stubbornly hard to get rid of in companies, despite the massive security risks. It's also a flag for people getting excited by piles of data and automation without understanding it deeply enough to be useful - the tools make it easy to produce very compelling-looking reports while the underlying data model isn't correct.
I saw all of these when I worked as a data migration consultant for Microsoft. I've seen companies making decisions based on compelling analysis of incomplete data that they never realised was incomplete, because some guy in accounts worked out how to play with the tools and was never questioned. I've seen departments consistently under-bill because their pretty reports were based on loops where the user had learnt some nested loops in VBA but didn't understand edge cases, 0- vs 1-based indices, and why they mattered. And if a database exposes dynamic SQL to internal users, there's a strong risk of injection attacks, deliberate or accidental. Like I say, I don't know your company - maybe this is just a personal-DVD-collection type of thing, or maybe you're an experienced DBA - but just sharing.
Anyway, back to 1): creating and destroying connections may have an impact if they're not being closed correctly on the server, which will have to wait for a timeout before releasing each connection. That would produce the symptom you describe. It may only add a fraction of a second on the client, but doing it multiple times unnecessarily suggests copy/paste coding, which is a flag for other risks. If the data connection is from VBA, it could be ADO or even DAO, and it could even be an early version. In any case, opening and closing multiple connections is less efficient than opening one connection, asking everything you want to ask, and closing it.
You mention inserts. If you mean inserts into a SQL Server table itself, then the database structure can make inserts get slower over time, depending on your clustering design, page fill, indexing and logging. Inserting a row here or there isn't a big deal if the page fill factor is low enough to accommodate it, but if you're splitting pages left and right then that means a lot of heavy lifting on the server, especially if indices need rebuilding as you go. This would also explain your timings issue.
The server may not be able to display the data how you wish, I totally get that, but it is brilliant at getting it into a state to be displayed by whatever business logic you want to apply thereafter. Maybe you're right and the data - through whoever's fault the system design was, or through inadequate specifications from the business - is in an unusable mess where odd values need to be looped through, picked from various tables and transformed at the client. It happens. It's just flag after flag that there is a bigger solution here. The 5-second thing isn't the problem; it's a symptom of a bigger problem, either a badly designed database or a cultural one of implementing a bad design and covering for it with layers of skunkworks.
It's like when you go to the doctor complaining of a cough and asking for a tablet to soothe it: there may also be an underlying condition which could be cured rather than covered up. The cough could be a simple cold that will go in a couple of days. It might be a sensitivity, which is readily treatable. It might be a fishbone - simple to treat, but no tablet is going to help. It could be emphysema, maybe there's a lung growth... you get the idea. I don't actually care about your work or spreadsheets; I'm just seeing a symptom and using professional experience to suggest that the symptom isn't the real problem, and that there are probably good ways to address that problem. But I'm not getting involved - hope it goes well for you and I'll butt out!
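And if the inserts really are being fired one by one from VBA, even just wrapping a batch in a single transaction can help a lot, since the log gets flushed once per commit rather than once per statement. A rough sketch (the "Readings" table is made up, and the routine expects an already-open ADODB.Connection):

    ' Rough sketch: batch many inserts inside one transaction.
    ' 129 = adCmdText (1) + adExecuteNoRecords (128): no recordset is built.
    Sub BatchedInserts(ByVal conn As Object)
        Dim i As Long
        conn.BeginTrans
        For i = 1 To 1000
            conn.Execute "INSERT INTO Readings (Id, Amount) VALUES (" & i & ", 0)", , 129
        Next i
        conn.CommitTrans
    End Sub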
Thanks for the advice, you obviously know this subject well.
This is all being run from my laptop (the database is stored locally), so there are no issues with slowing down a network or anything. This is for my own company and I'm on my own at the moment. The final files will be interrogated in a lot of detail by customers, so if anything has gone wrong I will probably find out about it quickly, and I manually check random ones myself too.
There probably is a more sophisticated and efficient way to do it, but as it's only taking a few hours it wouldn't really make any difference to me if it only took 10 minutes. My first batch is 10k files, but after this it will be at most 1k each week, so we're only talking under 30 minutes of processing time anyway.
The point about opening new connections etc. is interesting, but tbh I don't know off the top of my head how to keep a connection open and use it multiple times, so I'd need to get on Google and figure it out.
Hiya - I'm so sorry, as the bulk of my wall of text would only apply at a corporate level. In the circumstances, frankly, carry on as you are: you know your data, so you'll naturally be sanity-checking it as you go, and as it's your own business you're ideally placed to do that! And you're not going to leave the company with some other poor soul trying to untangle your code. It just sounded like a common situation in middle-to-large corporates and I jumped the gun!
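That said, since you mention not knowing how to keep a connection open: the pattern is simply open once before the loop, run as many queries as you like against it, and close once at the end. A minimal sketch - the connection string and table name are made up, so substitute your own:

    ' Minimal sketch of reusing one ADO connection across many queries.
    Sub ReuseOneConnection()
        Dim conn As Object, rs As Object, i As Long
        Set conn = CreateObject("ADODB.Connection")
        conn.Open "Provider=SQLOLEDB;Data Source=(local);" & _
                  "Initial Catalog=MyDb;Integrated Security=SSPI;"

        For i = 1 To 1000                      ' one connection, many queries
            Set rs = conn.Execute("SELECT * FROM MyTable WHERE FileId = " & i)
            ' ... put rs into the sheet here, e.g. Range("A2").CopyFromRecordset rs
            rs.Close
        Next i

        conn.Close                             ' close once, right at the end
        Set conn = Nothing
    End Sub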
The fact that you're talking about so few rows means that whilst, yes, there is a whole heap of things that could be optimised, it's not worth it going forwards. It's not even cost-effective to look! Anything involving 10k rows is small enough that a redesigned system would be so close to instant as to make no difference - but if you can just run it in downtime, there's no advantage.
To give an example of how a system review can make a difference: I consulted for a company that handles electricity meter details - industrial supplies measure consumption every half hour instead of every three months, so that's 17,500 or so data points to manage per meter per year (48 readings a day x 365 days). The old system used nested loops in a certain framework, and each meter took 30-45 minutes to analyse. By rejigging the framework, they got that down to 5 minutes. When we went back to the original specs and rewrote the job as setwise SQL, each meter took less than half a second! Better still, the new code was correct (manually verifiable), whereas the old code was so complicated that it actually gave wrong results - but nobody understood the data well enough to realise or check. It literally saved the company, as they only had 3 trading weeks of capital left; now we're 15 years on and they're buying up their competitors. OK, extreme example, but you can see why, if you were in a corporate, I felt the need to explain the warning flags!
Hope it goes well
This discussion has been closed.