Poorly phrased discovery requests are expensive for everyone

Kind of an obvious statement, but a recent case provides an object lesson. In the linked case, the discovery request included the phrase "unallocated space" along with keywords that have very general meanings.

The result? An unmanageable wad of data, only scant slivers of which were 'responsive', and finding those slivers would cost well over a million dollars.

"Unallocated space" is what gives me the shivers. That would require sorting through the empty spots of partitions looking for complete or partial files and producing those files and fragments. I know WWU wasn't equipped for that kind of discovery request, and we'd be knock-kneed about how to handle "unallocated blocks" on the SAN arrays themselves. It would suck a lot.

And in this case the data cost quite a bit to produce in the first place.

But this also shows the other side of the discovery request: the expense of sorting through the discovered data. My current employer does just that: pares down discovered data into the responsive parts (or makes the responsive parts much easier to find during manual review). And yes, it costs a lot.

Pricing is somewhat complicated, but the dominant model is based on price-per-GB with modifiers for what exactly you want done with the data. OCR costs extra. Transformation into various industry-common formats costs extra. That kind of thing. The price has been dropping a lot lately, but it's still quite common to find prices over $200/GB, and very recently prices were hovering around $1,000/GB.
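
To put rough numbers on that, here is a back-of-the-envelope sketch. Only the $200/GB and $1,000/GB base rates come from the figures above. The 500 GB data set and the $25/GB OCR surcharge are made up for illustration, and the sketch assumes modifiers stack as flat per-GB add-ons, which real vendor pricing may well not do.

```python
def processing_cost(size_gb, base_rate_per_gb, surcharges_per_gb=()):
    """Estimate the cost in dollars of processing a discovery data set.

    Assumes (hypothetically) that each modifier, such as OCR or format
    conversion, is a flat per-GB surcharge added to the base rate.
    """
    if size_gb < 0 or base_rate_per_gb < 0:
        raise ValueError("size and rate must be non-negative")
    rate = base_rate_per_gb + sum(surcharges_per_gb)
    return size_gb * rate


if __name__ == "__main__":
    size_gb = 500  # hypothetical data set size

    low = processing_cost(size_gb, 200)     # rate from the text above
    high = processing_cost(size_gb, 1000)   # rate from the text above
    # Hypothetical $25/GB OCR surcharge on top of the lower base rate.
    with_ocr = processing_cost(size_gb, 200, surcharges_per_gb=[25])

    print(f"{size_gb} GB at $200/GB:        ${low:,.0f}")
    print(f"{size_gb} GB at $1,000/GB:      ${high:,.0f}")
    print(f"{size_gb} GB at $200/GB + OCR:  ${with_ocr:,.0f}")
```

Even at the lower rate, a few hundred GB lands in six figures, which is why the breadth of the original request matters so much.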

Many sysadmins I know pride themselves on their ability to phrase Google search queries to get what they're looking for. It doesn't take long to locate exactly what we're looking for, or some hint on where to look next, and if a query misses we just rephrase it and try again.

Lawyers have to get the search query right on the first try. Laziness (being overly broad) costs everyone.