Blog(ik)


Insightful and useful eDiscovery tips, triks and thoughts from Logik Systems

In the spirit of sharing and giving back to the legal community, we have decided to share some helpful tips, triks and thoughts on all things eDiscovery.

We hope you like it. Please feel free to comment on any of the postings (note: the comments are moderated to avoid spam, BUT you do NOT have to sign up to post a comment, so edd_yoda, feel free to use whatever internet alias you like :)

Tip: Do Vertical De-duplication Correctly

Vertical de-duplication (i.e. identifying duplicate documents within only one custodian’s data) is not as efficient as horizontal de-duplication (i.e. identifying duplicate documents across one or more custodians’ data), but it is sometimes required by various agencies or even clients themselves.

To do vertical de-duplication correctly, you first need an organized data structure. Start with the exact spelling of the custodian’s name (preferably in Last-name, First-name format, with the middle initial if you can get it) as the top-level directory in which you will place the custodian’s data set(s).

Any data collected for the custodian should be placed within the corresponding custodian directory, in easily identified sub-directories (such as File Server, Exchange Email, Blackberry, etc.). The goal of the sub-directories is to keep track of where the data originally came from. You could do this backwards, with the data sources as the top-level structure and the custodian names as the sub-directories, but that becomes confusing and complicated to manage. Using the custodian name as the starting point will make things easier in the collection and processing stages. If done correctly, your structure should have one top-level directory per custodian, each containing one sub-directory per data source.
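As a minimal sketch of such a layout, a short Python script can create the directories. The custodian names and data sources below are hypothetical examples, and the helper name `build_structure` is just an illustration, not part of any eDiscovery tool:

```python
from pathlib import Path
import tempfile

# Hypothetical custodians (Last-name, First-name Middle-initial) and their data sources.
CUSTODIANS = {
    "Doe, Jane A": ["Exchange Email", "File Server"],
    "Smith, John Q": ["Blackberry", "Exchange Email", "File Server"],
}


def build_structure(root, custodians):
    """Create <custodian>/<data source> directories under root.

    Returns the created paths relative to root, in sorted order.
    """
    created = []
    for custodian, sources in sorted(custodians.items()):
        for source in sorted(sources):
            path = Path(root) / custodian / source
            path.mkdir(parents=True, exist_ok=True)
            created.append(path.relative_to(root).as_posix())
    return created


if __name__ == "__main__":
    # A temporary directory stands in for the real evidence drive.
    with tempfile.TemporaryDirectory() as tmp:
        for line in build_structure(tmp, CUSTODIANS):
            print(line)
```

Run as a script, this prints one line per custodian and source, such as `Doe, Jane A/Exchange Email`, which is the shape the rest of this tip assumes.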

Keeping a log of all the custodian names, along with the data sources (e.g. Exchange Email, File Server, etc.) collected for each custodian, is a crucial part of performing accurate vertical de-duplication. This is important because data is generally collected in batches during discovery. So, if a custodian has new data to be collected and processed, but the majority of the data has already been collected and processed, spelling the person’s name exactly as it was spelled before makes it very easy for de-duplication software to compare the new data against the old data. If the name does not match, most de-duplication software cannot easily detect whose data it is and may treat the new data as a completely new custodian’s data, so none of the new data will be de-duplicated against the old data.
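To see why the spelling matters, here is a rough sketch of vertical de-duplication in Python. It assumes documents are compared by a SHA-256 hash of their content and that the custodian name is part of the comparison key; real processing software will have its own hashing rules, and the names and sample data are hypothetical:

```python
import hashlib


def vertical_dedupe(records):
    """Keep the first copy of each document per custodian.

    records: iterable of (custodian_name, document_bytes) tuples.
    Returns a list of (custodian_name, sha256_hex) for the documents kept.
    """
    seen = set()
    kept = []
    for custodian, content in records:
        digest = hashlib.sha256(content).hexdigest()
        key = (custodian, digest)  # the custodian name is part of the key
        if key not in seen:
            seen.add(key)
            kept.append(key)
    return kept


# First batch: two custodians hold the same document, so both copies are kept.
first_batch = [("Doe, Jane A", b"Q3 forecast"), ("Smith, John Q", b"Q3 forecast")]
# Second batch: the same document again, once under the logged spelling
# and once under a slightly different spelling of the same person.
second_batch = [("Doe, Jane A", b"Q3 forecast"), ("Doe, Jane", b"Q3 forecast")]

kept = vertical_dedupe(first_batch + second_batch)
print(len(kept))  # 3
```

The copy filed under the logged spelling is dropped as a duplicate, while the copy filed under "Doe, Jane" survives as if it belonged to a brand-new custodian.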

Following these simple steps will not only help you organize your data better, but also make the overall eDiscovery process a little bit smoother for you and your preferred eDiscovery provider.

Posted by Andrew Wilson on January 1, 2008 | Permanent Link | Post a Comment

Thought: The "Un-compressed" Processing Myth

The eDiscovery market is booming and companies are springing up all over the place offering various services from collections to processing to hosting, etc. With this increasing pool of vendors comes very different and sometimes confusing pricing models.

One of the most common models we come across in competitive bids is charging customers for the “un-compressed” or “extracted” size. What does this mean exactly? It means that what you think may be 100GB of email could in fact become 200GB of email and attachments or more AFTER the data is processed. Essentially, this could double or even quadruple your eDiscovery budget.

Don’t let this happen to you! Vendors that use this model are providing their clients with misleading information. The “un-compressed” size is a myth. For example, let’s say you have an email with 10 embedded attachments that total 10MB (megabytes). When that email is processed with a vendor’s eDiscovery software, it will generally extract each individual attachment as its own record, alongside the original email that still contains the 10 embedded attachments. The email alone will be ~10MB and the extracted attachments will also total 10MB, so the total size of that email and its extracted attachments appears to have doubled to ~20MB. The true size of the email AND its attachments is actually ~10MB, because the email’s size already includes the embedded attachments. This is why some vendors charge for the “un-compressed” size: on the surface it looks like the data HAS doubled, when in fact nothing has really changed.
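The arithmetic in that example can be written out as a small Python sketch. The function name is made up for illustration, and the figures are the ones from the example above:

```python
def billed_sizes_mb(email_with_attachments_mb, attachments_mb):
    """Return (true_size, extracted_size) in MB for one email.

    true_size: the email as collected, with its attachments embedded.
    extracted_size: the email record plus every attachment counted a second
    time as its own record, which is what "un-compressed" pricing bills.
    """
    true_size = email_with_attachments_mb
    extracted_size = email_with_attachments_mb + attachments_mb
    return true_size, extracted_size


true_mb, extracted_mb = billed_sizes_mb(10, 10)
print(true_mb, extracted_mb)    # 10 20
print(extracted_mb / true_mb)   # 2.0
```

The same data is simply counted twice, so a per-gigabyte price applied to the extracted figure bills that email at double its real size.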

As more clients become educated about the sometimes very confusing world of eDiscovery pricing, this model will eventually go away. For now, be on the lookout for this model and ask your vendors whether they charge on the total “compressed” size or follow the “un-compressed/extracted” model. Knowing how they price ahead of time could keep your eDiscovery budget in check.

Posted by Andrew Wilson on November 8, 2007 | Permanent Link |

Trik: "Soft Delete" in Lotus

Lotus Notes databases (i.e. NSF files) have a somewhat annoying feature that can cause major headaches for litigation support personnel. The feature is called “soft delete”. It works like this:

When a Lotus Notes user deletes a record inside of an NSF database, that record is moved to the trash bin until it is permanently removed by the user or by the “soft delete” feature. By default, the “soft delete” feature has a 48-hour time limit for all records in the trash bin. After a record has been in the trash bin for more than 48 hours, it is permanently removed, and unless the user has a backup of the database it will be extremely difficult, if not impossible, to get that message back.

If the items in the trash bin are important to your case, then you can use this trik to keep the “soft delete” feature from permanently removing the records:

1. Before you open the copied NSF (note: you should always work off of copies, never the originals), change the system date of your computer to a few years ago;
2. Open the NSF database. The items in the trash bin should be intact;
3. Now, inside the NSF go to File >>> Database >>> Properties. Go to the Beanie tab (last tab). Set the “Soft Delete Expire Time in Hours” to something like 999,999 and click the arrow to accept the new value;
4. The “soft delete” feature has been triked and you can now close the NSF, reset your system date to the current month, day, year, and feel good that your trash bin has not been altered.

Read more about this feature here: Lotus Soft Delete

Posted by Andrew Wilson on October 10, 2007 | Permanent Link | Post a Comment

Thought: 5 (un-biased) Things to Consider When Choosing Your Next eDiscovery Provider

The Sedona Principles (note: this is a PDF link) gives an extremely detailed rundown of what to look for in a qualified eDiscovery provider, but for most people it might be a little overboard. Attempting to be as un-biased as possible, here are 5 things to consider before taking the plunge with a new eDiscovery provider:


  1. Qualified references: A few phone calls to an eDiscovery provider’s clients will give you more information about the provider than you could ever hope for, good and bad.
  2. Upfront pricing: You say, “How much is all this going to cost me?” If the answer is, “Well, it depends…” hang up the phone and look elsewhere.
  3. Knowledgeable sales staff: Your life will be much easier if your dedicated account person knows what he/she is doing and can answer questions on the spot.
  4. Defensible process: Simply asking the question, “Is your eDiscovery process defensible?” will not get you the right response. Try asking more open-ended questions like, “How do you handle evidence once it is received, in detail?”
  5. Capacity and capability: Finally, can the provider handle your workload, and do they have experience with the type of data you need processed?


Posted by Andrew Wilson on August 24, 2007 | Permanent Link | Comment [2]

Tip: Safe Transit for Electronic Media

We see it every day: a hard drive, CD, DVD or other electronic storage media containing highly confidential and important evidence, transferred around in a cardboard box, jewel case, envelope or nothing at all. That data can easily be destroyed by scratching, dropping or water damage if the media is not properly taken care of. Losing that data could potentially cost you a client, so why not spend $50 and buy a secure case that is lockable?

A company called Pelican makes air-tight and waterproof containers for sensitive devices like hard drives, guns, cameras, etc. We use them to transfer data back and forth between our clients along with a 4-digit TSA approved combination lock to add more security. The case, depending on the size and where you buy it, will cost you anywhere from $30 to $70 and the locks are sold at $10/lock. A small price to pay considering the alternative…

Posted by Andrew Wilson on August 10, 2007 | Permanent Link |

Tip: Speedy Hard Drive Transfer

Have you ever watched how long it takes to copy 100GB of email onto an external hard drive? I have, and it is dreadfully slow even onto a 7,200RPM USB 2.0 drive. Luckily there is a new type of connection hardware: eSATA. The advantages of eSATA over USB 2.0 are clear, especially for people in the lit support industry who are constantly transferring huge amounts of data from hard drive to hard drive. An eSATA connection has a line rate of up to 3Gbits/second (~375 megabytes/sec on paper, or roughly 300 megabytes/sec once the interface’s encoding overhead is subtracted). That’s right, around 300 MEGABYTES PER SECOND! At that speed, copying 1 gigabyte takes ~3 seconds. USB 2.0, depending on your drive speed and various other interferences, tops out at 480Mbits/second (~60 megabytes/sec) and in practice delivers less.
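To put those numbers in terms of a whole collection, here is a back-of-the-envelope sketch in Python. It uses the nominal line rates (3 Gbits/second and 480 Mbits/second), assumes 1 GB = 1000 MB, and ignores protocol overhead and the speed of the drive itself, so real copies will be slower:

```python
def transfer_seconds(size_gb, link_mbits_per_sec):
    """Best-case seconds to move size_gb at the link's nominal rate.

    Assumes 8 bits per byte and 1 GB = 1000 MB; ignores protocol
    overhead and the read/write speed of the drives involved.
    """
    megabytes_per_sec = link_mbits_per_sec / 8
    return size_gb * 1000 / megabytes_per_sec


for name, rate in [("eSATA", 3000), ("USB 2.0", 480)]:
    minutes = transfer_seconds(100, rate) / 60
    print(f"{name}: {minutes:.1f} minutes for 100 GB")
# eSATA: 4.4 minutes for 100 GB
# USB 2.0: 27.8 minutes for 100 GB
```

Even as a best case, the gap between the two interfaces is about a factor of six.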

Your next hard drive purchase should be an eSATA one. Things you will need: 1. eSATA card and cable 2. eSATA external hard drive. The transfer speed increase will be significant enough to save you tons of time copying data. While you are at it, use a command-line copy utility like XXCOPY or Robocopy to handle the copying, and stay away from Windows drag-and-drop maneuvers, as they will only slow things down.

Posted by Andrew Wilson on August 7, 2007 | Permanent Link | Comment

Tip: Text Editors

In the world of eDiscovery we deal with a LOT of text, sometimes gigabytes’ worth, and relying on good old Notepad just doesn’t cut it. Our favorite text editor is TextPad. It does just about everything you want a text editor to do: lightning fast search/replace, sorting, column selecting, etc.

The trial version of TextPad is fully functional, but in order to remove the annoying pop-up each time you open the application you will need to purchase it ($16.50/license). If you encounter text files that are larger than 2GB, then TextPad is NOT the choice for you. Check out LargeEdit by Netlegger Systems. This editor will open extremely large text files with ease and costs ~$25.

Posted by Andrew Wilson on August 1, 2007 | Permanent Link | Comment

Thought: Is eDiscovery a Commodity?

A few weeks ago I was in a meeting with a well known national computer forensic company (not named for obvious reasons) talking about a potential partnership opportunity between our two companies. The director of the Washington, DC office was explaining to us how their firm is like the “Cadillac” of computer forensics, etc. etc. when she blurted out, “Well, you know, what you do is really becoming a commodity in the industry.” I immediately thought to myself, “Are we a commodity? Is it true that we are no different than every other company selling the same services?” Panic almost started to seep in, but then I realized how misguided that statement was. Of course we are not a commodity. No company providing eDiscovery services is a commodity; even the ones using the exact same third party technology do not qualify.

So, what is a commodity?

According to Wikipedia:

“A commodity is something for which there is demand, but which is supplied without qualitative differentiation across a given market. Characteristic of commodities is that their prices are determined as a function of their market as a whole. In essence, commoditization occurs as a good or service becomes undifferentiated across its supply base by the diffusion of the intellectual capital necessary to acquire or produce it efficiently. As such, many products which formerly carried premium margins for market participants have become commodities, such as generic pharmaceuticals and silicon chips.”

When eDiscovery services and products can no longer be easily differentiated from one another, then, yes, eDiscovery will become a commodity, but that is far from happening anytime soon. The same holds true for any other professional service.

Posted by Andrew Wilson on July 25, 2007 | Permanent Link |

Trik: Preserve Those Dates/Times

It is pretty easy to change the timestamps on electronic documents: simply drag or copy a file to another location on your computer and presto, the file has a new created date/time. Modify that same file and it will have a new modified date/time. When dealing with electronic evidence it is important NOT to do just that, but more often than not it happens. To prevent future disaster, the next time you are copying evidence you or your client can use a free and easy-to-use tool: Robocopy.
For a list of commands you can use with Robocopy, click here

Robocopy, by default, will preserve the created and modified dates/times of every file. It has some other useful options, like copying or excluding selected file types (very useful for eDiscovery).

If you aren’t comfortable working at the command line, then you could use WinRAR to compress the evidence into one file, but be sure to set the “preserve dates/times” option before compressing. WinZip is not suggested here because it does not preserve both created and modified dates/times.

There are many copy utilities out there that will accomplish the same tasks, some free, some not-so-free, but these two solutions will do a good job at preserving important meta-data and keeping your evidence spoil-free.
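If you script your own copies instead, the same idea can be sketched in Python. This is only an illustration, not a replacement for the tools above: it relies on the standard library’s `shutil.copy2`, which attempts to preserve file metadata such as the modified date/time but does not guarantee that the created date/time survives on every platform. The file names and the helper name are hypothetical:

```python
import hashlib
import os
import shutil
import tempfile


def copy_preserving_mtime(src, dst):
    """Copy src to dst, keep the modified time, and verify the content hash.

    Returns the modified time of the copy. Files are read whole for hashing,
    which is fine for a sketch but should be chunked for very large files.
    """
    shutil.copy2(src, dst)  # copies data plus metadata such as the modified time
    with open(src, "rb") as a, open(dst, "rb") as b:
        same = hashlib.sha256(a.read()).digest() == hashlib.sha256(b.read()).digest()
    if not same:
        raise IOError(f"hash mismatch copying {src}")
    return os.stat(dst).st_mtime


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "evidence.txt")  # hypothetical file name
        dst = os.path.join(tmp, "copy.txt")
        with open(src, "w") as f:
            f.write("sample")
        os.utime(src, (946684800, 946684800))    # back-date to 2000-01-01 UTC
        copied_mtime = copy_preserving_mtime(src, dst)
        print(copied_mtime == os.stat(src).st_mtime)
```

Comparing hashes after the copy is an extra check that the content itself arrived unchanged, separate from the question of timestamps.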

Posted by Andrew Wilson on June 24, 2007 | Permanent Link |

Tip: Get Rid of Those Monitors!

Although computer monitors have come down in price over the years, they still aren’t very cheap, especially if you need a lot of them. Many eDiscovery providers, whether in-house or outside vendors, rely on many different kinds of servers and PCs to process data. Traditionally, each machine is paired with its own monitor in a 1-to-1 ratio. This method is inefficient and costly.

VNC technology is definitely not new to IT, but many people don’t know about it or how to use it. Two of the most popular VNC packages are TightVNC and RealVNC. Both offer free versions that will enable you to control multiple desktops using just one monitor; all you need is an IP address or computer name and presto, you are connected to that machine over the network.

You can literally access thousands of machines with VNC software using just one monitor. When you use VNC software, though, be very careful to set up its security properly.

Posted by Andrew Wilson on June 15, 2007 | Permanent Link | Comment