Tuesday, July 9, 2013

Principles of VDW development

Spitballing for a paper I'm (hopefully) helping to write.

The principles informing development of the VDW are:
  • The VDW exists to facilitate substantive health services, epidemiological, and health economics research, rather than advance the fields of medical informatics or computer science.  Thus we take a very pragmatic (and often not very sexy) approach to data sharing.  Results are always favored over methods.
  • Participation must not impair organizations' ability to protect patients/insureds as the human subjects of research, or abide by applicable laws and regulations (HIPAA, etc.).  Participation must also not impair organizations' ability to protect the interests of both local researchers and their parent organizations (i.e., the health care/insurance providers).
  • The data standards must be open and publicly available so that any interested organization with relevant data can implement them and potentially collaborate with other implementing organizations.
  • The data standards represent a "floor" rather than a "ceiling"—that is, implementers are free to embellish or enhance data structures in ways that do not break compatibility with the base VDW specifications, in order to increase the value to local researchers.
  • No "least common denominator" specifications.  Participating organizations have varying levels of detail in their local data.  Rather than discard all but the lowest level of detail available across all implementing sites, the best data specifications accommodate detail and make it optional.  For example, if some organizations record the precise day that an insured disenrolls, but others always pad this out to the last day of the month, the best specification will allow both kinds of organization to put their data in VDW form.
  • Participation can be partial: an organization can choose to implement, for example, 5 out of the 11 data areas.  Its attractiveness as a collaborator is thus diminished, of course, but it is still a participant in the VDW process.
  • Central coordination, but not authority.  Participation is voluntary by all organizations.  There is no central source of funding, and therefore no central authority directing the development of the VDW.  Decision-making is by rough consensus, with the goal of serving the greatest number of existing and foreseeable projects while not overtaxing the resources available at the sites.
The problem with the 3rd bullet there is that there's no central clearinghouse where a non-HMORN member org can advertise its implementation and hold itself out as a potential collaborator.
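
To make the disenrollment example from the "least common denominator" bullet a bit more concrete, here's a rough Python sketch of what "accommodate detail and make it optional" could look like.  To be clear, none of this comes from the actual VDW specs: `EnrollmentPeriod`, the `end_is_exact` flag, and both helper functions are names I made up purely for illustration.  The idea is just that one record layout can hold both the precise-day site's data and the month-padded site's data, and a multi-site project can coarsen to the shared level of detail only when it needs to.

```python
import calendar
from dataclasses import dataclass, replace
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class EnrollmentPeriod:
    """Hypothetical enrollment record; not the real VDW layout."""
    person_id: str
    enr_start: date
    enr_end: date
    # Optional detail: True if enr_end is the exact disenrollment day,
    # False if it was padded to month end, None if the site cannot say.
    end_is_exact: Optional[bool] = None


def covered_on(period: EnrollmentPeriod, day: date) -> bool:
    """Return True if `day` falls inside the enrollment period."""
    return period.enr_start <= day <= period.enr_end


def coarsen_to_month(period: EnrollmentPeriod) -> EnrollmentPeriod:
    """Pad enr_end to the last day of its month, for cross-site comparability."""
    last_day = calendar.monthrange(period.enr_end.year, period.enr_end.month)[1]
    return replace(
        period,
        enr_end=period.enr_end.replace(day=last_day),
        end_is_exact=False,
    )


# Site A records the precise disenrollment day; site B pads to month end.
site_a_record = EnrollmentPeriod("A1", date(2013, 1, 1), date(2013, 3, 12), True)
site_b_record = EnrollmentPeriod("B1", date(2013, 1, 1), date(2013, 3, 31), False)

# A single-site project at A keeps the extra precision; a multi-site
# project can coarsen A's record so both sites are treated the same way.
site_a_coarsened = coarsen_to_month(site_a_record)
```

Local researchers at the precise-day site lose nothing, and the month-padded site doesn't have to invent precision it doesn't have.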

I might add:
  • The data is never going to be perfect.  Like all data collected for any reason, VDW data has problems, not all of which we know about.  Problems can be introduced both in the local operational business systems from which the data originates, and in the work implementers do to transform it into VDW data.  We are committed to a transparent process by which problems are reported, prioritized, and fixed.  But users should not assume that the data is pristine or suitable for their intended use: caveat user, and do please report issues.
  • VDW is not the be-all and end-all of research data warehouses.  We are eager to (strategically) expand the domain of projects for which VDW data alone are sufficient, wherever current or future projects stand to benefit.  But potential users should not assume that a given study can be carried out using nothing but VDW data.  Custom programming is sometimes necessary, and some projects can't be carried out at all.  Also, "my project needs new variable X" is not a reason for us to drop everything we're doing and implement variable X.
What am I forgetting?  What could be said better?