Disk Space Manager

Analyzing, visualizing, and managing disk space

Find and Remove Duplicate Files

December 10, 2009 00:33 by author Mark Richards

The modern home or professional computing user’s appetite for disk space just keeps on growing, primarily because of our ever-evolving ability to digitize critical elements of our personal and business lives.

At home, people are amassing large volumes of documents, images, music files, and videos. In particular, modern digital photo and video cameras produce very large, high-resolution content files that consume considerable amounts of disk space. And how often have you downloaded and saved a file (be it an application, image, video, or whatnot) a second (or even third) time simply because you couldn’t quickly find the original that you knew existed somewhere on your hard drive?

In the business arena, we’re working to digitize our document libraries to improve efficiency and reduce use of paper products, often creating duplication along the way. We’re maintaining large image and video repositories for archival and analysis purposes. And we’re keeping these items in storage areas (often accessed through a local area network), portions of which are allocated to users who download (or exchange via email) and store the same files repeatedly.

In both the home and the office, the use of inexpensive external file storage mechanisms is also increasing. It’s not uncommon to see people carrying around one or more USB keys and copying files onto various desktop and notebook computers for immediate use. Many home and business computing users also utilize one or more larger external (eSATA, USB, or FireWire) storage volumes. All of these storage devices can quickly become cluttered with multiple copies of identical files.

And duplicate file storage has serious consequences. Not only is disk space used unnecessarily, but duplicate files can also degrade the performance of backup processes and, in the case of the frequently used online backup storage option, increase their expense considerably. The presence of duplicate files can also impede business collaboration when users discover multiple copies of seemingly identical documents, and it can hamper file system searches and archival processes.

Duplicate File Detective 3 is a software product designed to help users find and remove duplicate files. It can operate against local drives, network attached storage volumes, removable devices and more – and it can search them all at the same time. What’s more, it contains powerful tools to help remove or archive duplicate files quickly and safely.

With Duplicate File Detective 3, you can:

  1. Reclaim wasted local and network storage resources quickly and efficiently
  2. Speed up backup processes by reducing storage allocation redundancy
  3. Gain visibility into what types of duplicates are consuming space and who owns them
  4. Process (move, delete, or zip) duplicates safely with our built-in file management system
  5. Scan and de-dupe file systems of virtually any size with our extreme scalability engine
  6. Find duplicates by any combination of attributes, including content-only matching (regardless of file names) 
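
For readers curious what content-only matching means in practice, here is a minimal Python sketch of the general idea. It is an illustration only and not how Duplicate File Detective works internally: the function names (`file_digest`, `find_duplicates`), the choice of SHA-256, and the size-first grouping are all assumptions made for the example.

```python
import hashlib
import os
from collections import defaultdict


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(root):
    """Group files under `root` that have identical content.

    File names are ignored. Files are grouped by size first, so that
    only files sharing a size ever need to be hashed.
    """
    by_size = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                by_size[os.path.getsize(path)].append(path)
            except OSError:
                continue  # unreadable or vanished file: skip it

    groups = []
    for paths in by_size.values():
        if len(paths) < 2:
            continue  # a unique size cannot have a duplicate
        by_hash = defaultdict(list)
        for path in paths:
            try:
                by_hash[file_digest(path)].append(path)
            except OSError:
                continue
        groups.extend(sorted(g) for g in by_hash.values() if len(g) > 1)
    return groups
```

Calling `find_duplicates` on a folder returns a list of groups, where each group is a list of paths whose contents are byte-for-byte identical as far as the hash can tell. The sketch only reports duplicates; it never moves or deletes anything.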


You can download Duplicate File Detective 3 and begin using the free, fully-functional trial version immediately.



Solid State Drive (SSD) Storage Analysis & Management

December 9, 2009 06:23 by author Mark Richards

Our FolderSizes (disk space analysis and reporting) and Duplicate File Detective (duplicate file management) software products are used by many thousands of people all over the world, many of them at Fortune 500 companies that depend upon these tools for daily storage management.

For this reason (and because we're just generally geeky), we have a wide range of hardware in our development and test labs. This includes pretty much every type of removable storage device you can imagine, Internet connected storage mechanisms, large SANs (storage area networks) packed with millions of files (usually for stress and scalability testing purposes), and more. You can't take two steps in any direction around here without tripping over an eSATA or USB drive, and yes - there have been a few minor injuries as a result. But hey - it's all in the name of science! Errrr, software.

Testing all these gadgets (and optimizing our software as a result) is important, though not generally all that exciting. However, one particular class of devices - SSDs (solid state drives) - has really captured our attention as of late. In fact, I have a 128 GB Crucial SSD installed in the laptop that I'm using to write this blog entry. And you know what? They'll be prying it from my cold, dead hands.

The impact that SSDs have on general computing performance is simply startling, especially when coupled with Microsoft Windows 7. I mean, there is just nothing more soul-sucking than sitting around waiting for some clunky 5400 RPM mechanical hard drive to load Visual Studio 2008 - or any other equally large application, for that matter. With SSDs, even the chunkiest of applications launch almost instantly. And because laptops often have a bit less RAM than their desktop counterparts, they tend to make heavier use of disk-based paging. This means that installing a solid state drive has a very broad and compelling effect on the performance of the entire machine - no matter what you're doing.

The other thing I love is that I no longer need to be constantly worried about moving my laptop around. The simple fact of the matter is that normal hard drives really shouldn't be moved at all while they're spinning, and they're one of the most common points of failure in notebook computers. My current Toshiba laptop even came installed with a little taskbar applet that alerted me to such movement, and it nagged me constantly. Frankly, this is just one bit of stress I don't need.

Thankfully, SSDs mean no moving parts, and they have amazing shock and vibration tolerance. No more "mobile computing anxiety" for me. And my laptop is even a tad lighter with it.

Oh, and the SSD is absolutely dead silent. This is one of those things that I never really thought I'd care about until I experienced it for myself - the silence is just... beautiful.

Bottom line - if you value performance, reliability, and peace of mind, SSDs are the real deal. Yes, they're more expensive than the old mechanical drives, but I can tell you from personal experience - they're worth it!



What is Cluster Overhang (Disk Slack)?

December 9, 2009 00:47 by author Mark Richards

Some software tools, including our FolderSizes disk space analysis utility, are capable of reporting two size metrics for each file system object they encounter - "size" and "allocated size" (the latter is sometimes also called "size on disk"). In this blog entry, I will discuss what these metrics represent, and how they differ.

First, you'll need to know that disk space is allocated to files in units called clusters. The size of a cluster can vary depending upon a number of factors, including what file system is used (NTFS, FAT32, etc.) and partition size. Most people today running the Microsoft Windows operating system are using NTFS, which has a default cluster size of 4K (4096 bytes).

Since all files are stored within one or more clusters, their "size on disk" (allocated size) is always a multiple of the file system's cluster size. For example, if you are using NTFS with a 4K cluster size, any file containing between 1 and 4096 bytes of data will consume a single cluster. Any file containing between 4097 and 8192 bytes will use two clusters. And so on.

As a result, any file that has a size which is not an exact multiple of the file system's cluster size (and the vast majority aren't) will "waste" a portion of its last cluster. Therefore, a file's "allocated" size will usually be larger than its actual size. This wasted space is usually referred to as "cluster overhang" or "disk slack". Some tools such as our FolderSizes disk space analysis software can also report upon cluster overhang for folders (directories).
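
To make the rounding concrete, here is a minimal Python sketch of the calculation described above. The function names (`allocated_size`, `cluster_overhang`) are made up for this example, and the 4096-byte default simply mirrors the NTFS case discussed here. It models only the "round up to a whole cluster" rule and ignores file system specifics such as compression.

```python
def allocated_size(size_bytes, cluster_size=4096):
    """Round a file's size up to a whole number of clusters."""
    if size_bytes < 0 or cluster_size <= 0:
        raise ValueError("size must be >= 0 and cluster size must be > 0")
    clusters = -(-size_bytes // cluster_size)  # ceiling division
    return clusters * cluster_size


def cluster_overhang(size_bytes, cluster_size=4096):
    """Bytes left unused in the file's last cluster (its disk slack)."""
    return allocated_size(size_bytes, cluster_size) - size_bytes


print(allocated_size(1))       # 4096  (one cluster)
print(allocated_size(4097))    # 8192  (two clusters)
print(cluster_overhang(4097))  # 4095  (almost a whole cluster unused)
```

A file that is exactly 4096 or 8192 bytes long has an overhang of zero under this model, which is why only sizes that aren't exact multiples of the cluster size waste space.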

A rough estimate of wasted space for a volume can be calculated by multiplying the number of files it contains by half the cluster size, on the assumption that the last cluster of each file is, on average, half full. So, for example, if an NTFS file system with 4K clusters contains 50,000 files, the estimated wasted space would be 50,000 x 2,048 bytes, or roughly 98 MB of disk space.
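
The same rule of thumb can be written as a short Python sketch. The function name `estimate_wasted_bytes` is invented for this example, and the result is only the rough estimate described above, not a measurement of any real volume.

```python
def estimate_wasted_bytes(file_count, cluster_size=4096):
    """Rule-of-thumb slack estimate: half a cluster wasted per file."""
    return file_count * cluster_size // 2


wasted = estimate_wasted_bytes(50_000)
print(wasted)                        # 102400000 bytes
print(round(wasted / 1024 ** 2, 1))  # 97.7 (counting 1 MB as 1,048,576 bytes)
```

With 50,000 files and 4K clusters this works out to 102,400,000 bytes, which matches the example above. Larger cluster sizes scale the estimate up proportionally.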

Other factors, such as file system compression, can also affect the computation of allocated space.